Step-by-step EDA workflow that produces a comprehensive data profile with distributions, correlations, and anomalies.
Perform a systematic exploratory data analysis on [dataset description]. Follow this exact workflow:

**Step 1 — Shape & Schema:**
- How many rows and columns?
- Data types per column (numeric, categorical, datetime, text)
- Memory usage estimate

**Step 2 — Missing Data:**
- Missing value count and percentage per column
- Pattern analysis: are missing values random or systematic?
- Recommended handling strategy per column (drop/impute/flag)

**Step 3 — Distributions:**
- For numeric columns: mean, median, std, min, max, skewness
- For categorical columns: unique count, top 5 values with frequencies
- Flag any columns with >95% single value (near-zero variance)

**Step 4 — Correlations & Relationships:**
- Top 10 strongest correlations (positive and negative)
- Flag multicollinearity (|r| > 0.8)
- Categorical vs numeric: group-by means for key categories

**Step 5 — Anomalies & Outliers:**
- IQR-based outlier detection for numeric columns
- Impossible values (negative ages, future dates, etc.)
- Duplicate row analysis

**Step 6 — Summary & Recommendations:**
- Top 3 most interesting patterns discovered
- Data quality score (1-10) with justification
- Suggested next steps for deeper analysis

**Output format**: Structured report with code snippets in [Python/R/SQL].
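The six-step checklist above can be sketched in pandas. This is a minimal illustration, not a definitive implementation: it assumes a generic DataFrame `df`, uses the 1.5×IQR rule from Step 5, and leaves Step 6 (interpretation and recommendations) to the analyst. The function name `eda_profile` and the report keys are illustrative choices, not part of the prompt.

```python
import numpy as np
import pandas as pd


def eda_profile(df: pd.DataFrame) -> dict:
    """Minimal sketch of Steps 1-5 of the EDA workflow."""
    report = {}

    # Step 1 — Shape & Schema
    report["shape"] = df.shape
    report["dtypes"] = df.dtypes.astype(str).to_dict()
    report["memory_mb"] = df.memory_usage(deep=True).sum() / 1e6

    # Step 2 — Missing Data: count and percentage per column
    miss = df.isna().sum()
    report["missing"] = {
        c: (int(n), round(100 * n / len(df), 2)) for c, n in miss.items() if n > 0
    }

    # Step 3 — Distributions
    num = df.select_dtypes(include="number")
    report["numeric_summary"] = num.agg(
        ["mean", "median", "std", "min", "max", "skew"]
    ).to_dict()
    cat = df.select_dtypes(include=["object", "category"])
    report["top_categories"] = {c: cat[c].value_counts().head(5).to_dict() for c in cat}
    # Near-zero variance: >95% of rows share one value
    report["near_zero_variance"] = [
        c for c in df.columns
        if df[c].value_counts(normalize=True, dropna=False).iloc[0] > 0.95
    ]

    # Step 4 — Correlations: rank unique pairs by |r|, flag |r| > 0.8
    corr = num.corr()
    pairs = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1)).stack()
    ranked = pairs.reindex(pairs.abs().sort_values(ascending=False).index)
    report["top_correlations"] = ranked.head(10).to_dict()
    report["multicollinear"] = [p for p, r in pairs.items() if abs(r) > 0.8]

    # Step 5 — Outliers via 1.5*IQR fences, plus duplicate rows
    q1, q3 = num.quantile(0.25), num.quantile(0.75)
    iqr = q3 - q1
    mask = (num < q1 - 1.5 * iqr) | (num > q3 + 1.5 * iqr)
    report["outlier_counts"] = mask.sum().to_dict()
    report["duplicate_rows"] = int(df.duplicated().sum())

    return report
```

Domain-specific checks such as impossible values (negative ages, future dates) depend on the dataset's schema and are best added as explicit column-by-column rules.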