Profile and explore datasets to understand their shape, quality, and patterns before analysis. Use when encountering a new dataset, assessing data quality, discovering column distributions, identifying nulls and outliers, or deciding which dimensions to analyze.
Published by rebyteai
Runs in the cloud
No local installation
Dependencies pre-installed
Ready to run instantly
Secure VM environment
Isolated per task
Works on any device
Desktop, tablet, or phone
Systematic methodology for profiling datasets, assessing data quality, discovering patterns, and understanding schemas.
Before analyzing any data, understand its structure:
Table-level questions:
Column classification: Categorize each column as one of:
For each column, compute:
All columns:
Numeric columns (metrics):
min, max, mean, median (p50)
standard deviation
percentiles: p1, p5, p25, p75, p95, p99
zero count
negative count (if unexpected)
String columns (dimensions, text):
min length, max length, avg length
empty string count
pattern analysis (do values follow a format?)
case consistency (all upper, all lower, mixed?)
leading/trailing whitespace count
Date/timestamp columns:
min date, max date
null dates
future dates (if unexpected)
distribution by month/week
gaps in time series
Boolean columns:
true count, false count, null count
true rate
After profiling individual columns:
Rate each column:
Look for:
Red flags that suggest accuracy issues:
For numeric columns, characterize the distribution:
For time series data, look for:
Identify natural segments by:
Between numeric columns:
When documenting a dataset for team use:
## Table: [schema.table_name]
**Description**: [What this table represents]
**Grain**: [One row per...]
**Primary Key**: [column(s)]
**Row Count**: [approximate, with date]
**Update Frequency**: [real-time / hourly / daily / weekly]
**Owner**: [team or person responsible]
### Key Columns
| Column | Type | Description | Example Values | Notes |
|--------|------|-------------|----------------|-------|
| user_id | STRING | Unique user identifier | "usr_abc123" | FK to users.id |
| event_type | STRING | Type of event | "click", "view", "purchase" | 15 distinct values |
| revenue | DECIMAL | Transaction revenue in USD | 29.99, 149.00 | Null for non-purchase events |
| created_at | TIMESTAMP | When the event occurred | 2024-01-15 14:23:01 | Partitioned on this column |
### Relationships
- Joins to `users` on `user_id`
- Joins to `products` on `product_id`
- Parent of `event_details` (1:many on event_id)
### Known Issues
- [List any known data quality issues]
- [Note any gotchas for analysts]
### Common Query Patterns
- [Typical use cases for this table]
When connected to a data warehouse, use these patterns to discover schema:
-- List all tables in a schema (PostgreSQL)
SELECT table_name, table_type
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;
-- Column details (PostgreSQL)
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'my_table'
ORDER BY ordinal_position;
-- Table sizes (PostgreSQL)
SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
-- Row counts for all tables (general pattern)
-- Run per-table: SELECT COUNT(*) FROM table_name
When exploring an unfamiliar data environment:
Everyone else asks you to install skills locally. On Rebyte, just click Run. Works from any device — even your phone. No CLI, no terminal, no configuration.
Claude Code
Gemini CLI
Codex
Cursor, Windsurf, Amp
Creating algorithmic art using p5.js with seeded randomness and interactive parameter exploration. Use this when users request creating art using code, generative art, algorithmic art, flow fields, or particle systems. Create original algorithmic art rather than copying existing artists' work to avoid copyright violations.
Applies Anthropic's official brand colors and typography to any sort of artifact that may benefit from having Anthropic's look-and-feel. Use it when brand colors or style guidelines, visual formatting, or company design standards apply.
Create beautiful visual art in .png and .pdf documents using design philosophy. You should use this skill when the user asks to create a poster, piece of art, design, or other static piece. Create original visual designs, never copying existing artists' work to avoid copyright violations.
Guide users through a structured workflow for co-authoring documentation. Use when user wants to write documentation, proposals, technical specs, decision docs, or similar structured content. This workflow helps users efficiently transfer context, refine content through iteration, and verify the doc works for readers. Trigger when user mentions writing docs, creating proposals, drafting specs, or similar documentation tasks.
rebyte.ai — The only platform where you can run AI agent skills directly in the cloud
No downloads. No configuration. Just sign in and start using AI skills immediately.
Use this skill in Agent Computer — your shared cloud desktop with all skills pre-installed. Join Moltbook to connect with other teams.