HTTP Proxy Scanner: Find Free and Working Proxies

Building an HTTP proxy scanner is an excellent project for understanding network protocols, concurrency, and proxy mechanics. This guide will walk you through creating a functional, concurrent HTTP proxy scanner in Python.

By the end of this article, you will have a script that takes a list of potential proxy servers, tests their connectivity, and verifies whether they successfully mask your IP address.

Understanding the Architecture

An HTTP proxy scanner operates on a simple fetch-and-verify mechanism.

    [Scanner] --> [Candidate Proxy] --> [Judge Server]
        ^                                     |
        |__________ Returns public IP ________|

The Target List: A collection of IP addresses and ports to test.

The Proxy Judge: A reliable, public server that echoes back the requester’s IP address (e.g., http://httpbin.org).

The Concurrency Engine: A system to test hundreds of proxies simultaneously so the script does not stall on dead connections.

Step 1: Setting Up the Environment

We will use Python 3 and the requests library for handling HTTP traffic. Because standard sequential requests are too slow for scanning, we will use Python’s built-in concurrent.futures module to handle simultaneous connections.

First, ensure you have the required library installed:

    pip install requests

Step 2: Designing the Core Verification Logic

The core function must attempt to route a request through a specific proxy to a “proxy judge.” If the judge returns the proxy’s IP instead of your home IP, the proxy works.

    import requests


    def check_proxy(proxy_address):
        """
        Tests a single proxy.
        Format expected: 'ip:port' (e.g., '192.168.1.50:8080')
        """
        proxy_dict = {
            "http": f"http://{proxy_address}",
            "https": f"http://{proxy_address}",
        }

        # Using httpbin.org's /ip endpoint to echo back the IP address seen by the server
        judge_url = "http://httpbin.org/ip"

        try:
            # Low timeout is critical; dead proxies shouldn't hang your script
            response = requests.get(judge_url, proxies=proxy_dict, timeout=5)

            if response.status_code == 200:
                # Verify the judge actually saw the proxy IP, confirming anonymity
                returned_ip = response.json().get("origin")
                if returned_ip and proxy_address.split(":")[0] in returned_ip:
                    return {
                        "proxy": proxy_address,
                        "status": "Working",
                        "speed_ms": response.elapsed.total_seconds() * 1000,
                    }
        except (requests.RequestException, ValueError):
            # Captures timeouts, connection drops, and bad handshakes;
            # ValueError covers a judge response that is not valid JSON
            pass

        return {"proxy": proxy_address, "status": "Dead", "speed_ms": None}

Step 3: Implementing Concurrency

If you have 1,000 proxies to scan and each dead proxy runs into the full 5-second timeout, a single-threaded scanner could take over an hour in the worst case (1,000 × 5 s = 5,000 s, roughly 83 minutes). By using a thread pool, we can test dozens of proxies at the same time.
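The thread-pool idea can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name scan_proxies, the default of 50 workers, and the fake_check stand-in are assumptions introduced here. The checker is passed in as a parameter so the sketch runs without network access; in a real scan you would pass check_proxy from Step 2.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scan_proxies(candidates, check, max_workers=50):
    """Run check(proxy) for every candidate on a thread pool.

    `check` is any callable that takes an 'ip:port' string and returns a
    dict with "status" and "speed_ms" keys, such as check_proxy from Step 2.
    Returns only the results whose status is "Working", fastest first.
    """
    working = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every candidate up front; the pool caps how many run at once
        futures = {pool.submit(check, proxy): proxy for proxy in candidates}
        # as_completed yields each future as soon as it finishes
        for future in as_completed(futures):
            try:
                result = future.result()
            except Exception:
                # A checker that raises is treated as a dead proxy
                continue
            if result["status"] == "Working":
                working.append(result)
    # Sort by measured latency, lowest first
    working.sort(key=lambda r: r["speed_ms"])
    return working


if __name__ == "__main__":
    # Hypothetical stand-in checker so the sketch runs offline
    def fake_check(proxy):
        alive = proxy.endswith(":8080")
        return {
            "proxy": proxy,
            "status": "Working" if alive else "Dead",
            "speed_ms": 120.0 if alive else None,
        }

    sample = ["203.0.113.10:8080", "203.0.113.11:3128", "203.0.113.12:8080"]
    for entry in scan_proxies(sample, fake_check, max_workers=3):
        print(entry["proxy"], entry["speed_ms"])
```

With 50 workers and a 5-second timeout, the worst case for 1,000 dead proxies drops from about 5,000 seconds to roughly 100 seconds (20 batches of 5 seconds each). Threads suit this workload because each check spends nearly all of its time waiting on the network rather than using the CPU.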
