Generating a Page Performance Test Report by Parsing HAR Files with Python

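This article walks through a complete, ready-to-use approach for parsing HAR files with Python and automatically generating a visual page performance test report. It covers the whole flow: HAR parsing, performance metric extraction, and HTML report generation.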

I. Solution Overview

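About HAR files: HAR (HTTP Archive) is a JSON-based file format that records the details of the HTTP requests and responses exchanged between a browser and servers, together with page load timing data. It can be exported from the developer tools of browsers such as Chrome and Firefox.
Core dependencies:
- haralyzer: a library dedicated to parsing HAR files. It exposes the recorded pages and their requests so you do not have to walk the raw JSON by hand.
- jinja2: a lightweight template engine used to render the HTML report, with full control over the report's style and structure.
- json: Python built-in library used to read the HAR file from disk before handing the parsed data to haralyzer.
Workflow: obtain the HAR file → parse it with Python → extract the key performance metrics → render the HTML template → produce the visual report.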
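
To make the HAR structure concrete, here is a minimal sketch that uses only the standard library. The sample document is hand-written for illustration and contains just a few of the fields defined by HAR 1.2; the function name summarize_har is hypothetical and not part of any library.

```python
import json

# Hand-written sample for illustration only; a real export contains many more fields.
SAMPLE_HAR = """
{
  "log": {
    "version": "1.2",
    "pages": [
      {
        "id": "page_1",
        "title": "https://example.com/",
        "startedDateTime": "2024-01-01T08:00:00.000Z",
        "pageTimings": {"onContentLoad": 420.5, "onLoad": 980.0}
      }
    ],
    "entries": [
      {
        "pageref": "page_1",
        "time": 132.0,
        "request": {"method": "GET", "url": "https://example.com/"},
        "response": {"status": 200},
        "timings": {"dns": 10.0, "connect": 25.0, "ssl": 12.0,
                    "send": 1.0, "wait": 60.0, "receive": 36.0}
      },
      {
        "pageref": "page_1",
        "time": 48.0,
        "request": {"method": "GET", "url": "https://example.com/app.js"},
        "response": {"status": 404},
        "timings": {"dns": -1, "connect": -1, "ssl": -1,
                    "send": 1.0, "wait": 40.0, "receive": 7.0}
      }
    ]
  }
}
"""


def summarize_har(har_text):
    """Pull a few headline values out of a HAR document given as a JSON string."""
    log = json.loads(har_text)["log"]
    first_page = log["pages"][0]
    return {
        "version": log["version"],
        "page_title": first_page["title"],
        "on_load_ms": first_page["pageTimings"]["onLoad"],
        "entry_count": len(log["entries"]),
    }


summary = summarize_har(SAMPLE_HAR)
print(summary)
```

Everything lives under the top-level "log" object: "pages" holds page-level timings, and each item in "entries" describes one request with its own "timings" breakdown, where -1 marks a phase that did not apply.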

II. Environment Setup

Install the required dependencies first:

pip install haralyzer jinja2

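III. Key Steps

Step 1: Obtain the HAR file (in the browser)

Using Chrome as an example, export the HAR file for a page as follows:

1. Open Chrome and navigate to the page you want to test.
2. Press F12 to open the developer tools and switch to the Network panel.
3. Tick Preserve log in the toolbar at the top of the panel.
4. Reload the page (or load the target page again) and wait until all requests have finished.
5. Right-click any request in the Network panel, choose Save all as HAR with content (the exact menu label varies between Chrome versions), and save the file with a .har extension (for example, page_perf.har).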
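
Before feeding the exported file to the parser, it can help to confirm that it has the basic shape of a HAR document. The following is a minimal sketch under the assumption that only a handful of HAR 1.2 fields matter for the report; find_har_problems is a hypothetical helper, and it works on data that has already been loaded with json.load.

```python
def find_har_problems(har_data):
    """Return a list of human-readable problems; an empty list means the basic shape looks fine."""
    log = har_data.get("log") if isinstance(har_data, dict) else None
    if not isinstance(log, dict):
        return ["top-level 'log' object is missing"]

    problems = []
    entries = log.get("entries")
    if not isinstance(entries, list) or not entries:
        problems.append("'log.entries' is missing or empty (was the page reloaded while recording?)")
        return problems

    # Every entry needs these three objects for the metrics used in the report
    for index, entry in enumerate(entries):
        for key in ("request", "response", "timings"):
            if key not in entry:
                problems.append(f"entry {index} has no '{key}' object")

    # 'pages' is optional in HAR, but page-level timings depend on it
    if not log.get("pages"):
        problems.append("'log.pages' is missing or empty; page-level timings will be unavailable")
    return problems


good_har = {
    "log": {
        "pages": [{"id": "page_1"}],
        "entries": [{"request": {}, "response": {}, "timings": {}}],
    }
}
bad_har = {"log": {"entries": [{"request": {}, "response": {}}]}}

print(find_har_problems(good_har))
print(find_har_problems(bad_har))
```

An empty result only means the structure is plausible; it does not guarantee that the recording captured the full page load.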

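Step 2: Parse the HAR file with Python and extract performance metrics

Write a Python script that parses the HAR file and extracts the core page performance metrics (page load time, DNS lookup time, TTFB, and so on) together with the details of each request.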

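import json
from datetime import datetime

from haralyzer import HarParser, HarPage

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
# Values Chrome writes to the non-standard "_resourceType" field for JS, CSS and images
STATIC_TYPES = ("script", "stylesheet", "image")


def _ms(timings, key):
    """Return one HAR timing phase in ms; -1 or a missing key means 'not applicable'."""
    value = timings.get(key)
    return value if value and value > 0 else 0


def parse_har_file(har_file_path):
    """Parse a HAR file and extract page performance metrics and request data.

    :param har_file_path: path to the HAR file
    :return: (performance metrics dict, list of per-request detail dicts)
    """
    # Read the HAR file
    try:
        with open(har_file_path, "r", encoding="utf-8") as f:
            har_data = json.load(f)
    except FileNotFoundError:
        raise Exception(f"HAR file not found: {har_file_path}")
    except json.JSONDecodeError:
        raise Exception(f"HAR file is not valid JSON: {har_file_path}")

    # Initialize the HAR parser
    har_parser = HarParser(har_data)

    # Use the first page recorded in the HAR file
    har_pages = har_parser.pages
    if not har_pages:
        raise Exception("No page data found in the HAR file")
    target_page: HarPage = har_pages[0]

    # Timing fields are read from the standard HAR structure (log.pages / log.entries)
    log = har_data.get("log", {})
    raw_page = next((p for p in log.get("pages", []) if p.get("id") == target_page.page_id), {})
    all_entries = log.get("entries", [])
    # Fall back to every entry when none is tagged with this page's id
    entries = [e for e in all_entries if e.get("pageref") == target_page.page_id] or all_entries

    # Collect the details of each request
    request_details = []
    for entry in entries:
        timings = entry.get("timings", {})
        request = entry.get("request", {})
        response = entry.get("response", {})
        request_details.append({
            "url": request.get("url", ""),
            "method": request.get("method", ""),
            "status": response.get("status", 0),
            "resource_type": entry.get("_resourceType") or "unknown",
            "time_ms": round(entry.get("time") or 0, 2),
            "dns_ms": round(_ms(timings, "dns"), 2),
            "tcp_ms": round(_ms(timings, "connect"), 2),
            "ttfb_ms": round(_ms(timings, "wait"), 2),
            "size_bytes": max(response.get("bodySize") or 0, 0),
        })

    def total(phase):
        """Sum one timing phase over all requests of the page."""
        return round(sum(_ms(e.get("timings", {}), phase) for e in entries), 2)

    # Convert the page start time (ISO 8601) to local time
    started = raw_page.get("startedDateTime") or ""
    try:
        test_time = datetime.fromisoformat(started.replace("Z", "+00:00")).astimezone().strftime(TIME_FORMAT)
    except ValueError:
        test_time = started or "unknown"

    # Extract the core performance metrics (all times in ms)
    perf_metrics = {
        "page_name": target_page.title or "Unnamed page",
        "test_time": test_time,
        "page_load_time_ms": round(max(raw_page.get("pageTimings", {}).get("onLoad") or 0, 0), 2),
        "dns_time_ms": total("dns"),
        "tcp_time_ms": total("connect"),
        # TTFB of the first request, which is normally the main document
        "ttfb_ms": request_details[0]["ttfb_ms"] if request_details else 0,
        "send_time_ms": total("send"),
        "receive_time_ms": total("receive"),
        "ssl_send_receive_time_ms": round(total("ssl") + total("send") + total("receive"), 2),
        "total_requests": len(request_details),
        "failed_requests": len([r for r in request_details if r["status"] >= 400]),
        "static_resources": len([r for r in request_details if r["resource_type"] in STATIC_TYPES]),
    }
    return perf_metrics, request_details

Step 3: Create the HTML report template

Save the following Jinja2 template as perf_report_template.html in the same directory as the script.

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>{{ perf_metrics.page_name }} - Performance Test Report</title>
  <style>
    * { margin: 0; padding: 0; box-sizing: border-box; font-family: "Microsoft YaHei", sans-serif; }
    .container { width: 95%; margin: 20px auto; }
    .report-header { text-align: center; padding: 15px; background-color: #f5f5f5; border-radius: 5px; margin-bottom: 20px; }
    .metrics-card { background-color: #f8f9fa; padding: 20px; border-radius: 5px; margin-bottom: 20px; box-shadow: 0 2px 4px rgba(0,0,0,0.1); }
    .metrics-card h2 { color: #2c3e50; margin-bottom: 15px; border-bottom: 2px solid #3498db; padding-bottom: 10px; }
    .metrics-table { width: 100%; border-collapse: collapse; margin-top: 10px; }
    .metrics-table th, .metrics-table td { border: 1px solid #ddd; padding: 12px; text-align: left; }
    .metrics-table th { background-color: #3498db; color: white; font-weight: bold; }
    .metrics-table tr:nth-child(even) { background-color: #f2f2f2; }
    .warning { color: #e67e22; font-weight: bold; }
    .error { color: #e74c3c; font-weight: bold; }
  </style>
</head>
<body>
  <div class="container">
    <div class="report-header">
      <h1>{{ perf_metrics.page_name }} - Page Performance Test Report</h1>
      <p>Generated: {{ generate_time }} | Tested: {{ perf_metrics.test_time }}</p>
    </div>

    <div class="metrics-card">
      <h2>I. Core Performance Metrics</h2>
      <table class="metrics-table">
        <thead>
          <tr> <th>Metric</th> <th>Value</th> <th>Unit</th> <th>Description</th> </tr>
        </thead>
        <tbody>
          <tr><td>Total page load time</td><td>{{ perf_metrics.page_load_time_ms }}</td><td>ms</td><td>Time from the start of navigation until the page's load event fires</td></tr>
          <tr><td>Total DNS lookup time</td><td>{{ perf_metrics.dns_time_ms }}</td><td>ms</td><td>Total time spent resolving domain names; lower means faster resolution</td></tr>
          <tr><td>Total TCP connection time</td><td>{{ perf_metrics.tcp_time_ms }}</td><td>ms</td><td>Total time spent establishing connections; reflects network connection efficiency</td></tr>
          <tr><td>Time to first byte (TTFB)</td><td>{{ perf_metrics.ttfb_ms }}</td><td>ms</td><td>Time from sending the request to receiving the first byte from the server; reflects server response speed</td></tr>
          <tr><td>Total requests</td><td>{{ perf_metrics.total_requests }}</td><td>count</td><td>Number of HTTP/HTTPS requests made while loading the page</td></tr>
          <tr><td>Failed requests</td><td class="{% if perf_metrics.failed_requests > 0 %}error{% endif %}">{{ perf_metrics.failed_requests }}</td><td>count</td><td>Number of requests with status code ≥ 400; ideally 0</td></tr>
          <tr><td>Static resources (JS/CSS/images)</td><td>{{ perf_metrics.static_resources }}</td><td>count</td><td>Number of core static resources loaded by the page</td></tr>
        </tbody>
      </table>
    </div>

    <div class="metrics-card">
      <h2>II. Detailed Request List</h2>
      <table class="metrics-table">
        <thead>
          <tr> <th>Request URL</th> <th>Method</th> <th>Status</th> <th>Resource type</th> <th>Duration (ms)</th> <th>DNS (ms)</th> <th>TCP (ms)</th> <th>TTFB (ms)</th> <th>Response size (B)</th> </tr>
        </thead>
        <tbody>
          {% for req in request_details %}
          <tr>
            <td>{{ req.url }}</td>
            <td>{{ req.method }}</td>
            <td class="{% if req.status >= 400 %}error{% endif %}">{{ req.status }}</td>
            <td>{{ req.resource_type }}</td>
            <td>{{ req.time_ms }}</td>
            <td>{{ req.dns_ms }}</td>
            <td>{{ req.tcp_ms }}</td>
            <td>{{ req.ttfb_ms }}</td>
            <td>{{ req.size_bytes }}</td>
          </tr>
          {% endfor %}
        </tbody>
      </table>
    </div>
  </div>
</body>
</html>

Step 4: Render the template and generate the report

Append the following to the same script, after parse_har_file.

import os
import re
from datetime import datetime

from jinja2 import Environment, FileSystemLoader, select_autoescape


def generate_perf_report(har_file_path, output_report_path=None):
    """Generate the page performance HTML report in one call.

    :param har_file_path: path to the HAR file
    :param output_report_path: output path (default: <page name>_perf_report.html in the current directory)
    :return: path of the generated report
    """
    # 1. Parse the HAR file
    perf_metrics, request_details = parse_har_file(har_file_path)

    # 2. Configure the Jinja2 environment
    # The template file must live in the same directory as this script
    current_dir = os.path.dirname(os.path.abspath(__file__))
    env = Environment(
        loader=FileSystemLoader(current_dir),
        autoescape=select_autoescape(["html"]),  # escape URLs and titles before they reach the HTML
    )
    # Load the HTML template
    template = env.get_template("perf_report_template.html")

    # 3. Prepare the data passed to the template
    render_data = {
        "perf_metrics": perf_metrics,
        "request_details": request_details,
        "generate_time": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
    }

    # 4. Decide where to write the report
    if not output_report_path:
        # Page titles are often URLs, so replace characters that are unsafe in file names
        page_name = re.sub(r'[\\/:*?"<>|]', "_", perf_metrics["page_name"])
        output_report_path = f"{page_name}_perf_report.html"

    # 5. Render the template and write the HTML report
    with open(output_report_path, "w", encoding="utf-8") as f:
        f.write(template.render(render_data))
    return output_report_path


# Entry point: run everything in one go
if __name__ == "__main__":
    # HAR file path (replace with the path to your own HAR file)
    HAR_FILE_PATH = "page_perf.har"
    # Optional: explicit output path
    # OUTPUT_REPORT_PATH = "my_page_perf_report.html"
    try:
        report_path = generate_perf_report(HAR_FILE_PATH)
        # report_path = generate_perf_report(HAR_FILE_PATH, OUTPUT_REPORT_PATH)
        print(f"Success! Page performance report generated: {os.path.abspath(report_path)}")
    except Exception as e:
        print(f"Failed: {e}")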
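
Once a report exists, a common follow-up is to find out which requests dominate the load time and which phase is responsible. The sketch below works directly on HAR entries rather than on the report data; slowest_requests is a hypothetical helper and the sample entries are made up for illustration.

```python
def slowest_requests(entries, top_n=3):
    """Return (url, total_ms, slowest_phase) for the top_n slowest HAR entries."""
    ranked = sorted(entries, key=lambda e: e.get("time") or 0, reverse=True)
    results = []
    for entry in ranked[:top_n]:
        timings = entry.get("timings", {})
        # Keep only numeric phases that actually happened (-1 means 'not applicable')
        phases = {
            name: ms for name, ms in timings.items()
            if isinstance(ms, (int, float)) and ms > 0
        }
        slowest_phase = max(phases, key=phases.get) if phases else "n/a"
        results.append((entry["request"]["url"], entry.get("time") or 0, slowest_phase))
    return results


# Made-up entries with only the fields this sketch needs
sample_entries = [
    {"request": {"url": "https://example.com/"}, "time": 300,
     "timings": {"dns": 20, "connect": 30, "send": 1, "wait": 200, "receive": 49}},
    {"request": {"url": "https://example.com/big.js"}, "time": 900,
     "timings": {"dns": -1, "connect": -1, "send": 2, "wait": 100, "receive": 798}},
    {"request": {"url": "https://example.com/api"}, "time": 50,
     "timings": {"send": 1, "wait": 40, "receive": 9}},
]

top_two = slowest_requests(sample_entries, top_n=2)
for url, total_ms, phase in top_two:
    print(f"{total_ms:>6} ms  {phase:<8} {url}")
```

A request dominated by "wait" points at server response time, while one dominated by "receive" usually points at payload size or bandwidth.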

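Metric	Meaning and optimization direction
Total page load time	Time for the page to finish loading; lower is better. Optimization: reduce resource size, enable caching, lazy-load content.
DNS lookup time	Time spent resolving domain names. Optimization: use DNS caching, choose a high-quality DNS provider, reduce the number of domains.
TCP connection time	Time spent on the TCP handshake. Optimization: use HTTP/2 or HTTP/3, enable connection reuse, shorten the network path.
TTFB (time to first byte)	Server response speed. Optimization: optimize server endpoints, use a CDN, reduce backend database query time.
Failed requests	Number of abnormal requests; ideally 0. Investigate 4xx/5xx errors (for example, failing endpoints or missing resources).
Total requests	The more requests, the slower the load. Optimization: bundle resources (merge JS/CSS), use CSS sprites, remove redundant requests.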
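
Two of the suggestions above, reducing the number of domains and reducing the number of requests, are easier to act on with a per-host breakdown. A minimal sketch using only the standard library follows; requests_per_host is a hypothetical helper and the URLs are made up.

```python
from collections import Counter
from urllib.parse import urlsplit


def requests_per_host(urls):
    """Count requests per hostname; URLs without a host (such as data: URIs) go under 'unknown'."""
    return Counter(urlsplit(url).hostname or "unknown" for url in urls)


# Made-up request URLs, as they would appear in the request list of a report
sample_urls = [
    "https://example.com/",
    "https://example.com/app.js",
    "https://cdn.example.net/logo.png",
    "https://example.com/style.css",
    "data:image/png;base64,AAAA",
]

host_counts = requests_per_host(sample_urls)
for host, count in host_counts.most_common():
    print(f"{count:>3}  {host}")
```

Hosts with only one or two requests are candidates for consolidation, since each extra host can cost an additional DNS lookup and connection setup.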
