Course Outline

Introduction to Apache Iceberg

  • Overview of Apache Iceberg
  • Review of basic concepts

Deep Dive into Iceberg Architecture

  • In-depth analysis of Iceberg's table format
  • Detailed architecture overview, including metadata and file layout
  • Internals of schema and partition evolution

Advanced Installation and Configuration

  • Configuring Iceberg for optimal performance in different environments
  • Integration with various data processing engines
  • Advanced setup: security, encryption, and access controls
  • Setting up Iceberg in a distributed environment

Advanced Operations and Maintenance

  • Managing large-scale Iceberg tables
  • Implementing and managing complex schema changes
  • Handling partition evolution and hidden partitioning
  • Advanced CRUD operations with schema and partition changes

Query Optimization Techniques

  • Techniques for reducing query latency
  • Partition pruning and file pruning
  • Metadata caching and optimization strategies
  • Implementing and testing query optimization techniques

Performance Tuning for Large Datasets

  • Optimizing performance for large-scale datasets
  • Using Iceberg's built-in features for performance tuning
  • Case studies on performance tuning in real-world scenarios
  • Tuning performance for large-scale datasets

Advanced Data Migration and Integration

  • Migrating complex data structures from other systems
  • Integrating Iceberg with real-time data streams
  • Migrating complex datasets and integrating real-time data streams

Reliability and Consistency

  • Ensuring data consistency and integrity in distributed environments
  • Implementing and managing transactional guarantees
  • Handling failures and recovery mechanisms
  • Implementing reliability and consistency features

Advanced Features and Customization

  • Custom catalog implementations
  • Extending Iceberg with custom features
  • Implementing a custom catalog and extending Iceberg functionality

Data Governance and Compliance

  • Implementing data governance policies
  • Compliance with data regulations
  • Managing audit trails and data lineage
  • Implementing governance and compliance features

Summary and Next Steps

Requirements

  • Familiarity with core concepts, basic operations, and Iceberg table management

Audience

  • Data engineers
  • Data architects
  • Data analysts
  • Software developers

Duration

  • 21 hours