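Efficient Amazon Data Collection with the Web Unlocker API

In an era of data-driven decision-making, the vast amount of data on e-commerce platforms is critical. However, giants like Amazon have built extensive anti-scraping defenses, including IP blocking, CAPTCHA challenges, and browser fingerprinting, which conventional scraping tools often cannot get past.

Web Unlocker API provides an automated solution for handling complex site-unlocking work. Built on proxy infrastructure, it consists of three components: request management, browser fingerprint spoofing, and content validation. It automatically handles CAPTCHAs, fingerprints, and request-header optimization. Unlike a conventional proxy, it needs only a single API request to return clean HTML or a JSON response.

I. Introduction to the Web Unlocker API

The tool's core strength is that its algorithms transparently manage three tasks: selecting the best proxy network, customizing request headers, and solving CAPTCHAs dynamically. These capabilities matter most against heavily protected sites.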

Figure: Configuration interface example

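II. Getting Started with the Web Unlocker API

1. Console setup

First, log in to the console. In the left navigation bar, select 'Proxies & Scraping Infrastructure', find 'Web Unlocker', and click create.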

Figure: Console entry point

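2. Create a channel

On the creation page, enter a channel name and add a short description. Once you confirm, a new unlocker instance is generated.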

Figure: Creating a channel

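3. View the details

After creation, the system displays the instance's details, including credentials such as the host address, port, username, and password.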

Figure: Instance details
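
The host, port, username, and password from the details page are all a client needs to route traffic through the instance. A minimal sketch of turning them into a proxy mapping follows. The environment variable names (`BRD_HOST`, `BRD_PORT`, `BRD_USERNAME`, `BRD_PASSWORD`) and both helper functions are assumptions made for illustration; they are not part of the product.

```python
import os
from urllib.parse import quote


def build_proxies(host: str, port: int, username: str, password: str) -> dict:
    """Return a requests-style proxies mapping for an authenticated HTTP proxy."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot break the structure of the proxy URL.
    auth = f"{quote(username, safe='')}:{quote(password, safe='')}"
    proxy = f"http://{auth}@{host}:{port}"
    return {"http": proxy, "https": proxy}


def proxies_from_env() -> dict:
    """Read the credentials from (hypothetical) environment variables."""
    return build_proxies(
        host=os.environ["BRD_HOST"],
        port=int(os.environ["BRD_PORT"]),
        username=os.environ["BRD_USERNAME"],
        password=os.environ["BRD_PASSWORD"],
    )
```

Reading the values from the environment keeps the password out of source control, which is usually preferable to pasting it into a script.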

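4. Configure parameters

For complex sites, consider enabling dynamic residential IPs, custom fingerprints, and cookie support to maximize the chance of getting past detection.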

Figure: Advanced configuration

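5. Python script example

The following uses Amazon search data as an example of how to integrate the API.

(1) Identify the target

Search for 'gaming' on Amazon and copy the current URL. The results we need to parse typically include the ASIN, title, price, rating, and similar fields.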

Figure: Search results page
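
Instead of copying the URL by hand for every keyword, the search URL can also be assembled in code. The sketch below reproduces the query parameters of the URL used in this article (`k`, `language`, `_encoding`). The `build_search_url` helper is a hypothetical name, and a URL copied from the browser may carry extra parameters that are not modeled here.

```python
from urllib.parse import urlencode

AMAZON_SEARCH_BASE = "https://www.amazon.com/s"


def build_search_url(keyword: str, language: str = "zh") -> str:
    """Build an Amazon search URL for a keyword, URL-encoding the query."""
    query = urlencode({"k": keyword, "language": language, "_encoding": "UTF8"})
    return f"{AMAZON_SEARCH_BASE}?{query}"


print(build_search_url("gaming"))
# prints: https://www.amazon.com/s?k=gaming&language=zh&_encoding=UTF8
```

Because `urlencode` escapes spaces and special characters, multi-word keywords such as "gaming mouse" are handled without manual quoting.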

import warnings

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Suppress the warning that is raised when verify=False is used
warnings.filterwarnings("ignore", message="Unverified HTTPS request")

# Your Bright Data credentials (replace with the values of your own instance)
customer_id = "hl_da15f828"  # ID only; the "brd-customer-" prefix is added below
zone_name = "web_unlocker3"
zone_password = "q9crj4rw9004"

# Proxy settings
proxy_url = "brd.superproxy.io:33335"
proxy_auth = f"brd-customer-{customer_id}-zone-{zone_name}:{zone_password}"
proxies = {
    "http": f"http://{proxy_auth}@{proxy_url}",
    "https": f"http://{proxy_auth}@{proxy_url}",
}

# Target Amazon search URL
target_url = "https://www.amazon.com/s?k=gaming&language=zh&_encoding=UTF8"

# Request headers that mimic a real browser
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    "Referer": "https://www.amazon.com/",
}

try:
    print("Sending request...")
    # A timeout keeps the script from hanging indefinitely on a stalled connection
    response = requests.get(
        target_url, proxies=proxies, headers=headers, verify=False, timeout=60
    )
    print(f"Status code: {response.status_code}")

    # Save the raw HTML
    with open("amazon_gaming_search.html", "w", encoding="utf-8") as file:
        file.write(response.text)
    print("HTML saved to amazon_gaming_search.html")

    # Parse the results
    soup = BeautifulSoup(response.text, "html.parser")
    search_results = []

    # Select product cards that carry a non-empty data-asin attribute
    product_cards = soup.select(".s-result-item[data-asin]:not([data-asin=''])")
    print(f"Found {len(product_cards)} products")

    for card in product_cards:
        asin = card.get("data-asin")
        try:
            title_element = card.select_one("h2 a span")
            title = title_element.text.strip() if title_element else "N/A"

            price_element = card.select_one(".a-price .a-offscreen")
            price = price_element.text.strip() if price_element else "N/A"

            rating_element = card.select_one(".a-icon-star-small")
            rating = rating_element.text.strip() if rating_element else "N/A"

            reviews_element = card.select_one("span.a-size-base.s-underline-text")
            reviews = reviews_element.text.strip() if reviews_element else "N/A"

            search_results.append({
                "asin": asin,
                "title": title,
                "price": price,
                "rating": rating,
                "reviews": reviews,
                "url": f"https://www.amazon.com/dp/{asin}",
            })
            print(f"Parsed: {title[:30]}...")
        except Exception as e:
            print(f"Error while parsing product {asin}: {e}")

    # Export to CSV
    if search_results:
        df = pd.DataFrame(search_results)
        df.to_csv("amazon_gaming_search_results.csv", index=False, encoding="utf-8-sig")
        print(f"Scraped {len(search_results)} records and saved them to CSV")
        print(df.head())
    else:
        print("No valid search results found")
except Exception as e:
    print(f"Request failed: {e}")

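The script stores price and rating as raw display strings, which is inconvenient for sorting or filtering. One possible post-processing step is sketched below. The helpers `parse_price` and `parse_rating` are hypothetical, and they assume US-style number formatting (comma as thousands separator) and that the score is the first number in the rating text; other locales would need different handling.

```python
import re
from typing import Optional


def parse_price(text: str) -> Optional[float]:
    """Extract a numeric price from text such as '$1,299.99'; None if absent."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    # Drop thousands separators before converting to float
    return float(match.group().replace(",", ""))


def parse_rating(text: str) -> Optional[float]:
    """Extract the first number from text such as '4.5 out of 5 stars'."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None
```

Both helpers return `None` for the "N/A" placeholder used by the script, so missing values stay distinguishable from a real price of zero.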