AWS Data Analytics Solutions
Types of Analysis
Descriptive analysis (or data mining)
- Determine what generated the data (what happened)
- Requires the most human effort to interpret
Diagnostic analysis
- Determine why data was generated
- Understand root causes of events
Predictive analysis
- Determine future outcomes
- Uses descriptive and diagnostic analysis to predict future trends
Prescriptive analysis
- Determine action to take
- Uses the other three types to predict outcomes and recommend actions; can be automated
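As a rough illustration of how the analysis types build on each other, here is a minimal Python sketch on toy data. The order counts, the capacity threshold, and all function names are hypothetical and chosen only for illustration. Diagnostic analysis is left out because it depends on domain context that a short example cannot capture.

```python
from statistics import mean

# Hypothetical daily order counts; illustrative data only.
daily_orders = [100, 110, 120, 130, 140]


def describe(values):
    """Descriptive: summarize what the data shows."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def fit_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    xs = list(range(len(values)))
    x_bar, y_bar = mean(xs), mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def predict_next(values):
    """Predictive: extrapolate the fitted trend one step ahead."""
    slope, intercept = fit_trend(values)
    return slope * len(values) + intercept


def prescribe(values, capacity):
    """Prescriptive: turn the prediction into a recommended action."""
    return "add capacity" if predict_next(values) > capacity else "no action"
```

A simple straight-line trend stands in for a real predictive model here; the point is only that the prescriptive step consumes the output of the predictive step, which in turn relies on the described data.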
Analytics Processing Methods
Identify the analytics processing method based on the type of data collected and the type of analysis used
Batch analytics
- Large volumes of raw data
- Analytics processes run on a schedule, for example to generate reports
- Map-reduce type services: EMR
Interactive analytics
- Complex queries on complex data at high speed
- See query results immediately
- Athena, Elasticsearch, Redshift
Streaming analytics
- Analysis of data that has a short shelf life
- Incrementally ingest data and update metrics
- Kinesis
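The idea of incrementally ingesting data and updating metrics can be sketched in plain Python. This is not Kinesis code: the `RunningMetrics` class and the sensor readings are hypothetical, and a list stands in for records arriving from a stream.

```python
class RunningMetrics:
    """Keeps count, mean, and max up to date as records arrive one at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.maximum = None

    def update(self, value):
        self.count += 1
        # Incremental mean: earlier records do not need to be stored.
        self.mean += (value - self.mean) / self.count
        if self.maximum is None or value > self.maximum:
            self.maximum = value


# Hypothetical sensor readings standing in for records read from a stream.
metrics = RunningMetrics()
for reading in [20.0, 22.0, 27.0]:
    metrics.update(reading)
```

Each record is processed once and then discarded, which is what makes this style suitable for data whose value fades quickly.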
Analytics Solutions Patterns
Select the best option for a scenario based on the type of analytics and processing required
Analytics Solutions Patterns - EMR
Uses the map-reduce technique to break large processing problems into small jobs distributed across many nodes in a Hadoop cluster
- On-demand big data analytics
- Event-driven ETL
- Machine Learning predictive analytics
- Clickstream analysis
- Load data warehouses
Do not use for transactional processing or with small data sets
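To make the map-reduce technique concrete, here is a minimal single-process sketch in Python that counts words. It does not use EMR or Hadoop; the function names and the input chunks are hypothetical, and on a real cluster each chunk would be mapped on a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's values into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # Chunks are processed one after another here; a cluster would
    # run the map step for each chunk in parallel on separate nodes.
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return reduce_phase(shuffle(pairs))


counts = word_count(["click view click", "view purchase", "click"])
```

Because each chunk is mapped independently, the work splits cleanly across nodes. That coordination overhead is also why the approach pays off only on large data sets.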
Analytics Solutions Patterns - Kinesis
Streams data to analytics processing solutions
- Video analytics applications
- Real-time analytics applications
- Analyze IoT device data
- Blog posts and article analytics
- System and application log analytics
Do not use for small-scale throughput or for data with a longer shelf life
Analytics Solutions Patterns - Redshift
OLAP using BI tools
- Near real-time analysis of millions of rows of manufacturing data generated by continuous manufacturing equipment
- Analyze events from a mobile app to gain insight into how users use the application
- Gain value and insights from large, complex, and dispersed datasets
- Make live data generated by a range of next-gen security solutions available to large numbers of organizations for analysis
Do not use for OLTP or with small data sets
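The difference between OLAP and OLTP query shapes can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module purely as a stand-in, not Redshift itself, and the `app_events` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_events (user_id INTEGER, event TEXT, duration_s REAL)"
)
conn.executemany(
    "INSERT INTO app_events VALUES (?, ?, ?)",
    [
        (1, "open", 30.0),
        (1, "search", 12.0),
        (2, "open", 45.0),
        (2, "search", 8.0),
        (3, "open", 15.0),
    ],
)

# OLAP-style query: scan many rows and aggregate per group.
olap_rows = conn.execute(
    "SELECT event, COUNT(*), AVG(duration_s) "
    "FROM app_events GROUP BY event ORDER BY event"
).fetchall()

# OLTP-style statement: change a single row identified by its key.
conn.execute(
    "UPDATE app_events SET duration_s = 31.0 "
    "WHERE user_id = 1 AND event = 'open'"
)
```

The aggregate query reads every row but returns only a few summary values, which is the access pattern a data warehouse is built for. Frequent single-row writes like the `UPDATE` are the OLTP pattern the notes advise against.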