Athena 使用
资源
资源名称
作用
athena-query-2023
存放 Athena Query
xushengjin-server
S3 静态网站并开启 Server access logging
s3-service-access-loging
存放 xushengjin-server 的日志
创建一个存储桶用于 Athena 存放查询报告

设置查询结果位置

选择第一步创建的存储桶

创建数据库
create database s3_access_logs_db
image-20230422235016151 创建
s3_access_logs_db.mybucket_logs表CREATE EXTERNAL TABLE IF NOT EXISTS s3_access_logs_db.mybucket_logs( BucketOwner STRING, Bucket STRING, RequestDateTime STRING, RemoteIP STRING, Requester STRING, RequestID STRING, Operation STRING, Key STRING, RequestURI_operation STRING, RequestURI_key STRING, RequestURI_httpProtoversion STRING, HTTPstatus STRING, ErrorCode STRING, BytesSent BIGINT, ObjectSize BIGINT, TotalTime STRING, TurnAroundTime STRING, Referrer STRING, UserAgent STRING, VersionId STRING, HostId STRING, SigV STRING, CipherSuite STRING, AuthType STRING, EndPoint STRING, TLSVersion STRING ) ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe' WITH SERDEPROPERTIES ( 'serialization.format' = '1', 'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\") ([^ ]*)(?: ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*))?.*$' ) LOCATION 's3://s3-service-access-loging/';
image-20230423111834746 查询请求类型和状态码数量
SELECT requesturi_operation, httpstatus, count(*) FROM "s3_access_logs_db"."mybucket_logs" GROUP BY requesturi_operation, httpstatus;
image-20230423112002481 查询状态码
403的具体请求信息SELECT * FROM "s3_access_logs_db"."mybucket_logs" where httpstatus='403';
image-20230423112428019 可以 Download results 将查询结果下载到本地
参考
https://repost.aws/zh-Hans/knowledge-center/analyze-logs-athena
Last updated