Commit bdc7ce2

Document Athena S3 Tables query support (#611)
1 parent 66135e6 commit bdc7ce2

2 files changed

Lines changed: 169 additions & 0 deletions

src/content/docs/aws/services/athena.mdx

Lines changed: 162 additions & 0 deletions
@@ -218,6 +218,168 @@ s3://mybucket/prefix/metadata/snap-9068645333036463050-1-2f8d3628-bb13-4081-b5a9
s3://mybucket/prefix/temp/
```

## S3 Tables

LocalStack Athena can query [S3 Tables](/aws/services/s3tables/) through Glue federated catalogs, mirroring the AWS workflow that bridges S3 Tables, Glue, and Athena into a single query path.
This lets you point Athena at a table bucket and run SQL against the Iceberg tables it manages without copying data into a separate warehouse.

The flow is the same as on AWS:

1. Create a table bucket and namespaces in S3 Tables.
2. Register a Glue federated catalog (conventionally named `s3tablescatalog`) that delegates metadata to S3 Tables.
3. Register an Athena data catalog with `Type=GLUE` whose `catalog-id` parameter points to a specific table bucket via the federated catalog (`s3tablescatalog/<bucket-name>`).
4. Reference the Athena data catalog in `QueryExecutionContext` when running queries.
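
The four steps are tied together by a handful of names. As a minimal sketch, the shell variables below show how those names compose, using the example values from this page. The variable names themselves are illustrative and not part of any API.

```bash
# Illustrative variable names; the values match the examples on this page.
BUCKET_NAME="athena-doc-bucket"      # step 1: S3 Tables table bucket
NAMESPACE="sales"                    # step 1: S3 Tables namespace
GLUE_CATALOG="s3tablescatalog"       # step 2: Glue federated catalog
ATHENA_CATALOG="s3tables-catalog"    # step 3: Athena data catalog

# Step 3: the catalog-id parameter joins the federated catalog and the bucket.
CATALOG_ID="${GLUE_CATALOG}/${BUCKET_NAME}"

# Step 4: queries name the Athena data catalog and use the namespace as the database.
QUERY_CONTEXT="Catalog=${ATHENA_CATALOG},Database=${NAMESPACE}"

echo "catalog-id:    ${CATALOG_ID}"
echo "query context: ${QUERY_CONTEXT}"
```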

### Create S3 Tables resources

Create a table bucket and a namespace in S3 Tables.
The bucket holds your Iceberg tables and the namespace organizes them.

```bash
awslocal s3tables create-table-bucket --name athena-doc-bucket
```

```bash title="Output"
{
    "arn": "arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket"
}
```

```bash
awslocal s3tables create-namespace \
  --table-bucket-arn arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket \
  --namespace sales
```

```bash title="Output"
{
    "tableBucketARN": "arn:aws:s3tables:us-east-1:000000000000:bucket/athena-doc-bucket",
    "namespace": [
        "sales"
    ]
}
```
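
Other S3 Tables commands need the table bucket ARN again. The sketch below builds it once so it can be reused, assuming the ARN format shown in the output above and the region and account ID that appear there. `table_bucket_arn` is a hypothetical helper, not an `awslocal` command.

```bash
# Hypothetical helper: compose a table bucket ARN from its parts.
# The region and account defaults mirror the example output above.
table_bucket_arn() {
  local bucket="$1"
  local region="${2:-us-east-1}"
  local account="${3:-000000000000}"
  printf 'arn:aws:s3tables:%s:%s:bucket/%s\n' "$region" "$account" "$bucket"
}

TABLE_BUCKET_ARN="$(table_bucket_arn athena-doc-bucket)"
echo "$TABLE_BUCKET_ARN"
```

You could then pass `--table-bucket-arn "$TABLE_BUCKET_ARN"` to `create-namespace` instead of pasting the literal ARN.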

### Register a Glue federated catalog

Register a Glue catalog that federates to S3 Tables using the [`CreateCatalog`](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-catalog-Catalogs.html#aws-glue-api-catalog-CreateCatalog) API.
The catalog name `s3tablescatalog` matches the AWS convention used by Athena, EMR, and Redshift.

```bash
awslocal glue create-catalog \
  --name s3tablescatalog \
  --catalog-input '{
    "FederatedCatalog": {
      "Identifier": "arn:aws:s3tables:us-east-1:000000000000:bucket/*",
      "ConnectionName": "aws:s3tables"
    }
  }'
```

You can verify the federated catalog with:

```bash
awslocal glue get-catalogs
```
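
To check one catalog rather than scan the full listing, the output can be filtered with the AWS CLI's `--query` option. This is only a sketch: `federated_identifier` is a hypothetical wrapper, and the `CatalogList`, `Name`, and `FederatedCatalog.Identifier` fields are assumed to match the Glue `GetCatalogs` response shape.

```bash
# Hypothetical wrapper: print the federation identifier of a named Glue catalog.
# Assumes the response lists catalogs under "CatalogList".
federated_identifier() {
  local name="$1"
  awslocal glue get-catalogs \
    --query "CatalogList[?Name=='${name}'].FederatedCatalog.Identifier" \
    --output text
}
```

If the catalog exists, `federated_identifier s3tablescatalog` should print the S3 Tables ARN pattern that was registered above.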

### Register an Athena data catalog

Register an Athena data catalog that points at a specific table bucket using the [`CreateDataCatalog`](https://docs.aws.amazon.com/athena/latest/APIReference/API_CreateDataCatalog.html) API.
The `catalog-id` parameter follows the format `s3tablescatalog/<bucket-name>` so that Athena routes queries through the federated catalog path.

```bash
awslocal athena create-data-catalog \
  --name s3tables-catalog \
  --type GLUE \
  --parameters "catalog-id=s3tablescatalog/athena-doc-bucket"
```

Confirm the data catalog status:

```bash
awslocal athena get-data-catalog --name s3tables-catalog
```

```bash title="Output"
{
    "DataCatalog": {
        "Name": "s3tables-catalog",
        "Type": "GLUE",
        "Parameters": {
            "catalog-id": "s3tablescatalog/athena-doc-bucket"
        },
        "Status": "CREATE_COMPLETE"
    }
}
```
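
In a script it can be convenient to test the status rather than read the JSON by eye. The following sketch assumes the `DataCatalog.Status` field shown in the output above; `data_catalog_status` and `catalog_ready` are hypothetical helper names.

```bash
# Hypothetical helper: print only the Status field of a data catalog.
data_catalog_status() {
  awslocal athena get-data-catalog --name "$1" \
    --query 'DataCatalog.Status' --output text
}

# Hypothetical helper: succeed only when the catalog reports CREATE_COMPLETE.
catalog_ready() {
  [ "$(data_catalog_status "$1")" = "CREATE_COMPLETE" ]
}
```

For example, `catalog_ready s3tables-catalog && echo "catalog is ready"` prints the message only when the status matches.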

### Resolve metadata through the catalog

Once the data catalog is registered, Athena resolves S3 Tables namespaces as databases and the tables in each namespace as Athena tables.
List the databases exposed by the federated catalog:

```bash
awslocal athena list-databases --catalog-name s3tables-catalog
```

```bash title="Output"
{
    "DatabaseList": [
        {
            "Name": "sales",
            "Parameters": {
                "createdBy": "000000000000",
                "ownerAccountId": "000000000000"
            }
        }
    ]
}
```

You can also describe a single namespace with [`GetDatabase`](https://docs.aws.amazon.com/athena/latest/APIReference/API_GetDatabase.html):

```bash
awslocal athena get-database \
  --catalog-name s3tables-catalog \
  --database-name sales
```

### Run queries via the federated catalog

To query S3 Tables data from Athena, reference the data catalog name in the `QueryExecutionContext`.
The `Catalog` field maps to the Athena data catalog you registered, and `Database` maps to the S3 Tables namespace:

```bash
awslocal athena start-query-execution \
  --query-string "CREATE TABLE orders (id int, customer string, amount double) TBLPROPERTIES ('table_type' = 'ICEBERG')" \
  --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
  --result-configuration "OutputLocation=s3://athena-doc-output/results/"
```
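
The command above writes its results to `s3://athena-doc-output/results/`, a bucket that this walkthrough does not create. Assuming it does not exist yet in your LocalStack instance, a sketch like the following could create it before the first query. `ensure_bucket` is a hypothetical helper built from the standard `s3api head-bucket` and `s3 mb` commands.

```bash
# Hypothetical helper: create the results bucket only if it is missing.
ensure_bucket() {
  local bucket="$1"
  # head-bucket exits non-zero when the bucket is missing or inaccessible.
  if ! awslocal s3api head-bucket --bucket "$bucket" >/dev/null 2>&1; then
    awslocal s3 mb "s3://${bucket}"
  fi
}
```

Running `ensure_bucket athena-doc-output` once before the queries in this section is enough.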

Insert and read data using the same `QueryExecutionContext`:

```bash
awslocal athena start-query-execution \
  --query-string "INSERT INTO orders VALUES (1, 'alice', 100.0), (2, 'bob', 250.5)" \
  --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
  --result-configuration "OutputLocation=s3://athena-doc-output/results/"
```

```bash
awslocal athena start-query-execution \
  --query-string "SELECT * FROM orders ORDER BY id" \
  --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
  --result-configuration "OutputLocation=s3://athena-doc-output/results/"
```
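
`start-query-execution` returns a `QueryExecutionId` as soon as the query is submitted, so the rows are not available immediately. The sketch below submits a query, polls `get-query-execution` until it reaches a final state, and then fetches the results. `run_query` and the `POLL_INTERVAL` variable are hypothetical names; the catalog, database, and output location are the example values used above.

```bash
# Hypothetical helper: run one SQL statement and print its results.
run_query() {
  local sql="$1" id state

  # Submit the query and keep only the execution ID.
  id="$(awslocal athena start-query-execution \
    --query-string "$sql" \
    --query-execution-context "Catalog=s3tables-catalog,Database=sales" \
    --result-configuration "OutputLocation=s3://athena-doc-output/results/" \
    --query 'QueryExecutionId' --output text)" || return 1

  # Poll until the query leaves the QUEUED and RUNNING states.
  while true; do
    state="$(awslocal athena get-query-execution \
      --query-execution-id "$id" \
      --query 'QueryExecution.Status.State' --output text)" || return 1
    case "$state" in
      SUCCEEDED) break ;;
      FAILED|CANCELLED)
        echo "query $id ended in state $state" >&2
        return 1 ;;
    esac
    sleep "${POLL_INTERVAL:-2}"
  done

  awslocal athena get-query-results --query-execution-id "$id"
}
```

For example, `run_query "SELECT * FROM orders ORDER BY id"` prints the result set as JSON once the query succeeds.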

You can also use the catalog-id reference (`s3tablescatalog/<bucket-name>`) directly in `QueryExecutionContext.Catalog` if you prefer not to register a named Athena data catalog.
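
As a sketch of that direct form, the query context below swaps the registered catalog name for the federated path, reusing the example bucket and namespace from this page. `DIRECT_CONTEXT` and `query_direct` are hypothetical names.

```bash
BUCKET_NAME="athena-doc-bucket"

# Catalog holds the federated path instead of a registered data catalog name.
DIRECT_CONTEXT="Catalog=s3tablescatalog/${BUCKET_NAME},Database=sales"

# Hypothetical helper: submit a query using the direct catalog-id reference.
query_direct() {
  awslocal athena start-query-execution \
    --query-string "$1" \
    --query-execution-context "$DIRECT_CONTEXT" \
    --result-configuration "OutputLocation=s3://athena-doc-output/results/"
}
```

For example, `query_direct "SELECT * FROM orders ORDER BY id"` submits the same read as before without relying on the `s3tables-catalog` data catalog.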

:::note
Query execution against the federated catalog routes through Trino's Iceberg connector inside the LocalStack bigdata container.
The first query may take several minutes while LocalStack downloads and starts the bigdata dependencies.
Subsequent queries reuse the running services.
:::

## Client configuration

You can configure the Athena service in LocalStack with various clients, such as [PyAthena](https://github.com/laughingman7743/PyAthena/), [awswrangler](https://github.com/aws/aws-sdk-pandas), among others!

src/content/docs/aws/services/s3tables.mdx

Lines changed: 7 additions & 0 deletions
@@ -164,6 +164,13 @@ awslocal s3tables list-tables \
}
```

## Querying S3 Tables from Athena

LocalStack [Athena](/aws/services/athena/) can query S3 Tables data through a Glue federated catalog.
Once you register a federated `s3tablescatalog` in Glue and add a matching Athena data catalog, you can run SQL against your S3 Tables namespaces and tables directly from Athena.

See [S3 Tables in the Athena documentation](/aws/services/athena/#s3-tables) for the full workflow.

## API Coverage

<FeatureCoverage service="s3tables" client:load />
