AWS Athena

Interactive query service for analyzing data stored in S3 with standard SQL; queries are executed by Presto

Usage pattern

  • Interactive ad-hoc querying for web logs
  • Query staging data before loading into Redshift
  • Send AWS service logs to S3 for analysis with Athena
  • Integrate with notebooks such as Jupyter and Zeppelin

Cost

  • $5 per TB of data scanned by queries
  • Save on per-query costs and get better performance by compressing, partitioning, and converting data into columnar formats
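
The pricing rule above is simple arithmetic, sketched here in Python. This is a rough estimate only: it assumes 1 TB means 2**40 bytes, ignores any per-query minimum or rounding, and treats all columns as equally sized. The function names and sample numbers are illustrative and not part of any AWS API.

```python
TB = 1024 ** 4           # assumption: 1 TB taken as 2**40 bytes
PRICE_PER_TB_USD = 5.0   # rate quoted in the list above


def estimate_query_cost(bytes_scanned: float,
                        price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Approximate cost in USD of one query that scans `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    return bytes_scanned / TB * price_per_tb


def bytes_after_optimizing(raw_bytes: float,
                           compression_ratio: float,
                           columns_read: int,
                           columns_total: int) -> float:
    """Rough size of the scan once data is compressed and stored by column.

    Simplification: every column is assumed to take the same share of space.
    """
    if compression_ratio <= 0 or columns_total <= 0:
        raise ValueError("compression_ratio and columns_total must be positive")
    return raw_bytes / compression_ratio * columns_read / columns_total


# Illustrative numbers: 1 TB of raw data, 4:1 compression, 2 of 10 columns read.
full_scan_cost = estimate_query_cost(TB)
optimized_cost = estimate_query_cost(bytes_after_optimizing(TB, 4.0, 2, 10))
```

With these made-up inputs the full scan costs $5, while the compressed, columnar scan touches about one twentieth of the data and costs roughly $0.25.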

Performance

  • Compress, partition, and convert data into columnar formats to reduce the amount of data each query scans
  • Columnar formats allow Athena to read only the columns it needs to process a query
  • Executes queries using compute resources across multiple facilities
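
Partitioning is commonly expressed through the S3 key layout. A minimal sketch, assuming Hive-style key=value prefixes partitioned by date: the prefix, file name, and helper name below are hypothetical. A query that filters on the partition columns then only needs to read objects under the matching prefixes.

```python
from datetime import date


def partitioned_key(prefix: str, day: date, filename: str) -> str:
    """Build an S3 object key that uses Hive-style year/month/day partitions."""
    return (
        f"{prefix.strip('/')}"
        f"/year={day.year:04d}/month={day.month:02d}/day={day.day:02d}"
        f"/{filename}"
    )


# Hypothetical prefix and file name, for illustration only.
key = partitioned_key("logs/web", date(2024, 1, 5), "part-0000.parquet")
```

Here `key` becomes `logs/web/year=2024/month=01/day=05/part-0000.parquet`, so a filter on year, month, and day maps to a single prefix.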

Durability and availability

  • Automatically reroutes queries if a particular facility is unreachable
  • S3 is the underlying data store, so data inherits S3’s eleven 9s (99.999999999%) durability

Scalability and elasticity

  • Serverless, scales automatically as needed

Interfaces

  • Athena CLI, API (via the AWS SDKs), and JDBC
  • QuickSight visualizations
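
As an illustration of the API route, the sketch below submits a query and polls until it finishes. It assumes a boto3 Athena client is passed in (for example `boto3.client("athena")`) and uses that client's `start_query_execution`, `get_query_execution`, and `get_query_results` calls. The helper name, polling limits, database name, and S3 output location are placeholders, not values from this note.

```python
import time


def run_query(client, sql, database, output_location,
              poll_seconds=1.0, max_polls=60):
    """Submit `sql` through an Athena client and return the raw result page."""
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    for _ in range(max_polls):
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            return client.get_query_results(QueryExecutionId=query_id)
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"query {query_id} ended in state {state}")
        time.sleep(poll_seconds)  # still QUEUED or RUNNING

    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

A call might look like `run_query(client, "SELECT 1", "my_database", "s3://my-bucket/athena-results/")`, where the database and bucket are hypothetical. Only the first page of results is returned; larger result sets would need pagination.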

Anti-patterns

  • Enterprise reporting and business intelligence workloads; Redshift is a better choice
  • ETL workloads; EMR and Glue are better choices
  • Not a replacement for an RDBMS