Athena 使用

资源

资源名称
作用

athena-query-2023

存放 Athena Query

xushengjin-server

S3 静态网站并开启 Server access logging

s3-service-access-loging

存放 xushengjin-server 的日志

  1. 创建一个存储桶用于 Athena 存放查询报告

image-20230422234402871
  1. 设置查询结果位置

image-20230422234607047
  1. 选择第一步创建的存储桶

image-20230422234851557
  1. 创建数据库

    create database s3_access_logs_db
    image-20230422235016151
  2. 创建 s3_access_logs_db.mybucket_logs

    CREATE EXTERNAL TABLE IF NOT EXISTS s3_access_logs_db.mybucket_logs(
             BucketOwner STRING,
             Bucket STRING,
             RequestDateTime STRING,
             RemoteIP STRING,
             Requester STRING,
             RequestID STRING,
             Operation STRING,
             Key STRING,
             RequestURI_operation STRING,
             RequestURI_key STRING,
             RequestURI_httpProtoversion STRING,
             HTTPstatus STRING,
             ErrorCode STRING,
             BytesSent BIGINT,
             ObjectSize BIGINT,
             TotalTime STRING,
             TurnAroundTime STRING,
             Referrer STRING,
             UserAgent STRING,
             VersionId STRING,
             HostId STRING,
             SigV STRING,
             CipherSuite STRING,
             AuthType STRING,
             EndPoint STRING,
             TLSVersion STRING
    ) 
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
    WITH SERDEPROPERTIES (
             'serialization.format' = '1', 'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) \\\"([^ ]*) ([^ ]*) (- |[^ ]*)\\\" (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\") ([^ ]*)(?: ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*))?.*$' )
    LOCATION 's3://s3-service-access-loging/';
    image-20230423111834746
  3. 查询请求类型和状态码数量

    SELECT requesturi_operation, httpstatus, count(*) 
    	FROM "s3_access_logs_db"."mybucket_logs" 
    	GROUP BY requesturi_operation, httpstatus;
    image-20230423112002481
  4. 查询状态码 403 的具体请求信息

    SELECT * FROM "s3_access_logs_db"."mybucket_logs"
    	where httpstatus='403';
    image-20230423112428019

    可以 Download results 将查询结果下载到本地

参考

https://repost.aws/zh-Hans/knowledge-center/analyze-logs-athena

Last updated