AWS Athena Connection

Jedify's Semantic Fusion™ platform integrates with AWS Athena to bring natural language analytics to your Iceberg data lake. The integration supports processing millions of queries per month on Iceberg tables, and uses intelligent query planning to optimize cost and performance.

Prerequisites

Before connecting Jedify to your Athena environment, ensure you have:

✅ AWS account with Athena enabled
✅ Iceberg tables registered in AWS Glue Data Catalog
✅ S3 bucket for Athena query results (staging directory)
✅ AWS credentials with appropriate permissions (see Required AWS Permissions)

Authentication Method

Jedify supports programmatic access via IAM Access Keys for connecting to your Athena environment. This approach aligns with AWS Athena best practices and provides a simple, secure, and auditable setup.

1. Create IAM Policy for S3 Data Lake Access

Create a custom IAM policy that allows read access to the S3 bucket(s) containing your Iceberg data.

Note: Include both bucket-level and object-level permissions. To cover multiple data lake buckets, list each bucket in both Resource arrays.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::your-data-lake-bucket/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::your-data-lake-bucket"]
    }
  ]
}
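When several data lake buckets are involved, generating the policy can be less error-prone than editing the JSON by hand. The following is a minimal sketch, not part of Jedify or AWS tooling; the function name `data_lake_read_policy` and the second bucket name are hypothetical.

```python
import json


def data_lake_read_policy(bucket_names):
    """Build a read-only S3 policy covering one or more data lake buckets."""
    if not bucket_names:
        raise ValueError("at least one bucket name is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level permission: read data and metadata files.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{name}/*" for name in bucket_names],
            },
            {
                # Bucket-level permissions: locate and list each bucket.
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
            },
        ],
    }


if __name__ == "__main__":
    # Bucket names are placeholders; replace them with your own.
    buckets = ["your-data-lake-bucket", "your-second-data-lake-bucket"]
    print(json.dumps(data_lake_read_policy(buckets), indent=2))
```

The printed JSON can be pasted into the IAM policy editor. Each bucket appears once with the `/*` suffix (objects) and once without it (the bucket itself).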

2. Create IAM User with Programmatic Access

Go to AWS Console → IAM → Users → Add user
User name: jedify-athena-user
Select Programmatic access
Attach the following policies:
- AWSQuicksightAthenaAccess
- Custom S3 policy for data lake access (step 1)
- Custom S3 policy for query results bucket (step 4)
Create the user and save the Access Key ID and Secret Access Key in a secure location. AWS shows the Secret Access Key only once, at creation time.

3. Create / Identify S3 Bucket for Athena Query Results

Athena requires a staging bucket for query results.

Create a new bucket (e.g. my-athena-results) or reuse an existing one
Ensure the bucket is in the same AWS region as Athena
Staging directory format:

s3://my-athena-results/jedify/
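Because the staging directory must be an `s3://` URI ending in a trailing slash, it can help to check the value before entering it. This is a small sketch using only the Python standard library; `normalize_staging_dir` is a hypothetical helper, not a Jedify function.

```python
from urllib.parse import urlparse


def normalize_staging_dir(uri):
    """Validate an S3 staging URI and make sure it ends with a slash."""
    parsed = urlparse(uri)
    # A valid value has the s3 scheme and a bucket name.
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://bucket/prefix/, got {uri!r}")
    return uri if uri.endswith("/") else uri + "/"


if __name__ == "__main__":
    print(normalize_staging_dir("s3://my-athena-results/jedify"))
```

A value such as `my-athena-results/jedify` (no scheme) is rejected, while a missing trailing slash is added.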

4. Update IAM Policy for Query Results Bucket

The managed policy AWSQuicksightAthenaAccess only grants S3 access to buckets whose names match:

arn:aws:s3:::aws-athena-query-results-*

If your query results bucket has a different name, add the following statement to a custom policy:

{
  "Effect": "Allow",
  "Action": [
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:ListMultipartUploadParts",
    "s3:AbortMultipartUpload",
    "s3:CreateBucket",
    "s3:PutObject"
  ],
  "Resource": [
    "arn:aws:s3:::your-query-results-bucket",
    "arn:aws:s3:::your-query-results-bucket/*"
  ]
}

5. Add Glue Data Catalog Permissions

If your Athena tables are registered in AWS Glue (recommended), add:

{
  "Effect": "Allow",
  "Action": [
    "glue:GetDatabase",
    "glue:GetDatabases",
    "glue:GetTable",
    "glue:GetTables",
    "glue:GetPartition",
    "glue:GetPartitions"
  ],
  "Resource": "*"
}
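The blocks in steps 4 and 5 are individual statements, not complete policy documents. IAM expects them inside a document with `Version` and `Statement` keys. The sketch below shows one way to assemble them; the helper names are hypothetical and the bucket name is a placeholder.

```python
import json


def query_results_statement(bucket):
    """Read/write statement for a custom Athena query results bucket (step 4)."""
    return {
        "Effect": "Allow",
        "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket",
            "s3:ListBucketMultipartUploads",
            "s3:ListMultipartUploadParts",
            "s3:AbortMultipartUpload",
            "s3:CreateBucket",
            "s3:PutObject",
        ],
        "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
    }


def glue_read_statement():
    """Read-only Glue Data Catalog statement (step 5)."""
    return {
        "Effect": "Allow",
        "Action": [
            "glue:GetDatabase",
            "glue:GetDatabases",
            "glue:GetTable",
            "glue:GetTables",
            "glue:GetPartition",
            "glue:GetPartitions",
        ],
        "Resource": "*",
    }


def wrap_statements(*statements):
    """Wrap statement fragments in a complete IAM policy document."""
    return {"Version": "2012-10-17", "Statement": list(statements)}


if __name__ == "__main__":
    policy = wrap_statements(
        query_results_statement("your-query-results-bucket"),
        glue_read_statement(),
    )
    print(json.dumps(policy, indent=2))
```

Whether to keep these as one custom policy or two is a matter of preference. A single document is shown here only to illustrate the wrapper.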

Required AWS Permissions (Consolidated)

Jedify requires the following logical permission groups:

- Athena: Start and monitor queries, and retrieve their results
- S3 (Data Lake): Read-only access to Iceberg table buckets
- S3 (Query Results): Read/write access to the Athena staging bucket
- AWS Glue: Read access to databases, tables, and partitions
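As a rough pre-flight check, the permission groups above can be compared against a policy document before it is attached. The sketch below is illustrative only. The Athena action names are an assumption about what "start, monitor, and retrieve" maps to, and the check looks only at `Allow` statements and action names. It ignores `Resource`, `Deny`, and `Condition`, so it cannot tell the two S3 groups apart by bucket.

```python
from fnmatch import fnmatchcase

# Assumed mapping from permission group to IAM actions.
REQUIRED_ACTIONS = {
    "Athena": [
        "athena:StartQueryExecution",
        "athena:GetQueryExecution",
        "athena:GetQueryResults",
    ],
    "S3 (Data Lake)": ["s3:GetObject", "s3:ListBucket", "s3:GetBucketLocation"],
    "S3 (Query Results)": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
    "AWS Glue": [
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:GetPartition",
        "glue:GetPartitions",
    ],
}


def allowed_patterns(policy):
    """Collect action patterns from all Allow statements, lowercased."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid
        statements = [statements]
    patterns = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        patterns.extend(a.lower() for a in actions)
    return patterns


def missing_actions(policy):
    """Return {group: [actions]} for required actions no pattern covers."""
    patterns = allowed_patterns(policy)
    missing = {}
    for group, actions in REQUIRED_ACTIONS.items():
        # IAM action names are case-insensitive and may contain wildcards.
        gaps = [
            a for a in actions
            if not any(fnmatchcase(a.lower(), p) for p in patterns)
        ]
        if gaps:
            missing[group] = gaps
    return missing
```

An empty result means every listed action is matched by at least one `Allow` pattern. It does not prove the policy grants access to the right buckets.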

Connection Configuration

Provide the following to Jedify:

Required

Parameter	Description	Example
AWS Region	Athena region	us-east-1
S3 Staging Directory	Athena query results path	s3://my-athena-results/jedify/

Authentication

IAM Access Keys

Access Key ID
Secret Access Key

Optional

Workgroup (default: primary)
Catalog Name (default: AwsDataCatalog)
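The parameters above can be pictured as a small configuration object with two optional fields. The sketch below is not Jedify's configuration schema. The class name, field names, and the `from_env` helper are hypothetical, and reading credentials from the standard AWS environment variables is one assumed way to avoid hard-coding keys.

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AthenaConnectionConfig:
    # Required parameters
    region: str
    s3_staging_dir: str
    access_key_id: str
    # Excluded from repr so the secret does not leak into logs.
    secret_access_key: str = field(repr=False)
    # Optional parameters with the defaults listed above
    workgroup: str = "primary"
    catalog_name: str = "AwsDataCatalog"

    @classmethod
    def from_env(cls, s3_staging_dir, env=None, **overrides):
        """Build a config from AWS_REGION, AWS_ACCESS_KEY_ID and
        AWS_SECRET_ACCESS_KEY; raises KeyError if one is missing."""
        env = os.environ if env is None else env
        return cls(
            region=env["AWS_REGION"],
            s3_staging_dir=s3_staging_dir,
            access_key_id=env["AWS_ACCESS_KEY_ID"],
            secret_access_key=env["AWS_SECRET_ACCESS_KEY"],
            **overrides,
        )
```

For example, `AthenaConnectionConfig.from_env("s3://my-athena-results/jedify/", workgroup="analytics")` overrides only the workgroup and keeps the default catalog name. The workgroup name `analytics` is a placeholder.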

Iceberg Tables in Jedify

Jedify automatically discovers:

Databases in Glue
Tables (Iceberg and non-Iceberg)
Column schemas and types
Partition columns
Table statistics

Iceberg tables are queried like standard SQL tables. No special syntax is required.

Semantic Models and Query Optimization

Jedify translates semantic models into Athena-compatible SQL and automatically adds partition filters when applicable.

Optimization Example

If day is a partition column, Jedify:

Detects it from Iceberg metadata
Adds default date filters (for example, last 30 or 90 days)
Reduces scanned data and query cost, by 10–100x in some cases, depending on how much of the table the filter excludes
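The idea behind the default date filter can be sketched in a few lines. This is an illustration of the technique under stated assumptions, not Jedify's query planner: the function names are hypothetical, the partition column `day` is assumed to be of type DATE, and the table and column names are placeholders.

```python
import re
from datetime import date, timedelta


def default_partition_filter(partition_column, today, lookback_days):
    """Build a predicate that keeps only the most recent partitions."""
    start = today - timedelta(days=lookback_days)
    return f"{partition_column} >= DATE '{start.isoformat()}'"


def build_query(table, columns, partition_column=None, filters=(),
                today=None, lookback_days=30):
    """Assemble a SELECT, adding a default partition filter when the
    caller has not already constrained the partition column."""
    today = today or date.today()
    where = list(filters)
    if partition_column:
        # Whole-word match, so a filter on "weekday" does not count as "day".
        pattern = rf"\b{re.escape(partition_column)}\b"
        if not any(re.search(pattern, f) for f in where):
            where.append(
                default_partition_filter(partition_column, today, lookback_days)
            )
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql


if __name__ == "__main__":
    print(build_query("sales.orders", ["order_id", "amount"],
                      partition_column="day", today=date(2024, 3, 31)))
    # SELECT order_id, amount FROM sales.orders WHERE day >= DATE '2024-03-01'
```

If the caller already filters on `day`, the query is left unchanged, so an explicit date range always takes precedence over the default.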

Pro Tips

Athena charges per TB scanned, so use partitioning and columnar formats (such as Parquet) to reduce the data each query reads
Ensure S3 buckets and Athena are in the same region
Use lifecycle policies to clean up old query results
Always include partition filters on Iceberg tables
The S3 staging directory must end with a trailing slash
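The first tip can be made concrete with rough arithmetic. The sketch below is illustrative. The price per TB is passed in by the caller because it varies by region, the figure of 5.0 used in the example is an assumption, and a TB is assumed to be 10^12 bytes. The pruning factor also assumes partitions of roughly equal size.

```python
def scan_cost_usd(bytes_scanned, usd_per_tb, bytes_per_tb=10**12):
    """Estimate query cost from bytes scanned and a per-TB price."""
    return bytes_scanned / bytes_per_tb * usd_per_tb


def partition_pruning_factor(total_partitions, partitions_read):
    """How many times less data is read when only some partitions are
    scanned, assuming evenly sized partitions."""
    if not 0 < partitions_read <= total_partitions:
        raise ValueError("partitions_read must be between 1 and total_partitions")
    return total_partitions / partitions_read


if __name__ == "__main__":
    # Assumed example: 2 TB table, one daily partition per day for a year.
    full_scan = scan_cost_usd(2 * 10**12, usd_per_tb=5.0)
    factor = partition_pruning_factor(365, 30)
    print(f"full scan: ${full_scan:.2f}, last 30 days: ${full_scan / factor:.2f}")
```

With these assumed numbers, restricting a query to the last 30 daily partitions reads roughly 12 times less data than a full-year scan.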