AWS Kinesis

Kinesis

Load and analyze streaming data; ingest real-time data into data lakes and data warehouses.

Usage patterns

  • Move data from producers and process it continuously, either transforming it before it moves to another data store or using it to drive real-time metrics and analytics
  • Real-time data analytics
  • Log intake and processing
  • Real-time metrics and reporting
  • Video/Audio processing

Cost

  • Pay for the resources consumed
  • Data Streams: hourly price per shard
  • Charge per 1 million PUT payload units (PUT transactions)
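
As a rough illustration of how the shard-hour charge and the PUT charge combine, the sketch below estimates a monthly bill for a provisioned stream. Both prices are placeholder values chosen for the example, not AWS list prices, and the function names are hypothetical. The 25 KB payload unit size is the figure AWS documents for Data Streams; check the current pricing page before relying on any number.

```python
import math

# Hypothetical prices for illustration only; real prices vary by region.
SHARD_HOUR_PRICE = 0.015            # assumed USD per shard-hour
PUT_UNIT_PRICE_PER_MILLION = 0.014  # assumed USD per 1M PUT payload units
PUT_UNIT_BYTES = 25 * 1024          # one PUT payload unit covers up to 25 KB


def put_payload_units(record_bytes: int) -> int:
    """Number of 25 KB payload units billed for one record."""
    return max(1, math.ceil(record_bytes / PUT_UNIT_BYTES))


def estimate_monthly_cost(shards: int, records_per_second: float,
                          record_bytes: int, hours: int = 730) -> float:
    """Shard-hour charge plus PUT payload unit charge for the given hours."""
    shard_cost = shards * hours * SHARD_HOUR_PRICE
    units = records_per_second * 3600 * hours * put_payload_units(record_bytes)
    put_cost = units / 1_000_000 * PUT_UNIT_PRICE_PER_MILLION
    return shard_cost + put_cost
```

Calling estimate_monthly_cost(2, 100, 1024) would give the estimate for two shards receiving 100 one-kilobyte records per second over a 730-hour month, under the assumed prices.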

Performance

  • Data Streams: throughput capacity is set by the number of shards
  • Provision as many shards as the workload needs
  • Synchronously replicates data across three AZs
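
Because capacity is set per shard, sizing a stream comes down to simple arithmetic. The sketch below assumes the per-shard limits documented for provisioned streams (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads); the function name and its inputs are hypothetical, and real sizing should leave headroom for spikes and uneven partition keys.

```python
import math

# Per-shard limits as documented for provisioned streams.
WRITE_MB_PER_SHARD = 1.0
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2.0


def shards_needed(write_mb_per_s: float, records_per_s: float,
                  read_mb_per_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
    )
```

A workload writing 3.5 MB/s as 2,000 records/s and reading 4 MB/s is bound by write bandwidth, so shards_needed(3.5, 2000, 4) comes out at 4.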

Durability and availability

  • Highly available and durable because data is replicated across multiple AZs in one region
  • Keep a cursor (checkpoint) in DynamoDB so that failed applications can be restarted
  • Resume at the exact position in the stream where the failure occurred
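
The cursor-and-resume idea can be sketched without any AWS dependency. In the example below, CheckpointStore is a hypothetical in-memory stand-in for the DynamoDB table, list indices stand in for Kinesis sequence numbers, and the consumer checkpoints after every record; a real consumer would typically checkpoint less often.

```python
from typing import Callable, Dict, List


class CheckpointStore:
    """In-memory stand-in for the DynamoDB table that holds the cursor."""

    def __init__(self) -> None:
        self._positions: Dict[str, int] = {}

    def load(self, shard_id: str) -> int:
        # -1 means nothing has been processed yet.
        return self._positions.get(shard_id, -1)

    def save(self, shard_id: str, position: int) -> None:
        self._positions[shard_id] = position


def consume(shard_id: str, records: List[str], store: CheckpointStore,
            handle: Callable[[str], None]) -> None:
    """Process records after the saved position, checkpointing each one."""
    start = store.load(shard_id) + 1
    for position in range(start, len(records)):
        handle(records[position])       # may raise and abort this run
        store.save(shard_id, position)  # record progress only after success
```

If handle raises partway through, the saved position still points at the last record that succeeded, so calling consume again with the same store picks up at the record that failed rather than at the start of the stream.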

Scalability and elasticity

  • Use API calls to automate scaling; stream capacity can be increased or decreased at any time
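
A minimal sketch of automating a resize through the API follows. It assumes a boto3-style Kinesis client is passed in as client and uses that client's describe_stream_summary and update_shard_count calls. The helper names are hypothetical, and the clamp reflects the documented rule that a single UpdateShardCount request cannot go above double or below half of the current shard count.

```python
def clamp_target(current: int, desired: int) -> int:
    """Keep one resize within the half-to-double range of the current count."""
    lowest = max(1, -(-current // 2))  # ceil(current / 2)
    highest = current * 2
    return min(max(desired, lowest), highest)


def resize_stream(client, stream_name: str, desired: int) -> int:
    """Request a new shard count and return the target that was sent."""
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["OpenShardCount"]
    target = clamp_target(current, desired)
    if target != current:
        client.update_shard_count(
            StreamName=stream_name,
            TargetShardCount=target,
            ScalingType="UNIFORM_SCALING",
        )
    return target
```

With boto3 installed and credentials configured, the client would come from boto3.client("kinesis"). Reaching a target outside the half-to-double range would take several calls, each made after the previous resize has completed.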

Interfaces

  • Two interfaces:
    • Input (Kinesis Producer Library (KPL), agent, PUT API)
    • Output (Kinesis Client Library (KCL))
    • Kinesis Storm Spout: reads from a Kinesis stream into Apache Storm
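
On the input side, the PUT API can be called directly. The sketch below assumes a boto3-style Kinesis client passed in as client and uses its put_records call, which accepts at most 500 records per request; the helper names and the (partition_key, payload) event shape are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def batches(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_events(client, stream_name: str,
                events: Iterable[Tuple[str, bytes]]) -> int:
    """Send (partition_key, payload) pairs; return how many records failed."""
    entries = [{"PartitionKey": key, "Data": data} for key, data in events]
    failed = 0
    for batch in batches(entries, MAX_RECORDS_PER_CALL):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed
```

This sketch only counts failures. A production producer would retry the failed entries, which is part of what the KPL handles on the caller's behalf.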

Anti-patterns

  • Small-scale consistent throughput
  • Long-term data storage and analytics; Redshift, S3, or DynamoDB are better choices

Analytics Solutions Patterns with Kinesis

Kinesis streams data to analytics processing solutions:

  • Video analytics applications
  • Real-time analytics applications
  • Analyze IoT device data
  • Blog posts and article analytics
  • System and application log analytics