
When scraping data, proxies serve as a crucial tool. They help bypass rate limits, mask identities, and prevent blocks. For those using Python, the Requests module simplifies working with proxies.
This guide outlines how to use proxies with Requests in Python, covering everything from basic setups to advanced proxy rotation. Whether you're new to scraping or looking to sharpen your workflow, this post provides practical tips and highlights common mistakes to avoid.
First things first—make sure you've installed the Requests library. If you haven't, you can quickly get started with:
python3 -m pip install requests
Now, let’s perform a simple HTTP request using a proxy. Here’s the basic syntax:
import requests
http_proxy = "http://130.61.171.71:3128"
proxies = {
    "http": http_proxy,
    "https": http_proxy,
}
resp = requests.get("https://ifconfig.me/ip", proxies=proxies)
print(resp, resp.text)
What's happening here?
We're simply defining a proxy, passing it to the proxies dictionary, and making a request. In this case, the output should be an IP address from the proxy:
$ python3 main.py
<Response [200]> 130.61.171.71
This structure can look confusing at first. Why does HTTP appear in both the key and the value? The proxies dictionary maps each URL scheme (such as http or https) to the proxy that should handle requests for that scheme; the value is simply the proxy server's own URL. Although it may seem redundant, the mapping is explicit and stays manageable as your setup grows.
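One nice consequence of this mapping: the keys don't have to be bare schemes. Requests also accepts scheme://host keys, which lets you route a specific site through its own proxy. A quick sketch, where both proxy endpoints are placeholders:
import requests

proxies = {
    "http": "http://130.61.171.71:3128",
    "https": "http://130.61.171.71:3128",
    # Host-specific override: only requests to this site use this proxy
    "https://example.com": "http://another-proxy.example:8080",
}
resp = requests.get("https://ifconfig.me/ip", proxies=proxies)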
There are several types of proxy connections:
HTTP Proxy – Fast but unencrypted. Use it for high-volume requests where the traffic isn't sensitive.
HTTPS Proxy – Adds encryption, but a bit slower due to SSL/TLS overhead.
SOCKS5 Proxy – The most flexible. Use it if you need to connect to services beyond HTTP/HTTPS (like FTP or even the Tor network).
For SOCKS5 proxies, you'll need the requests[socks] extra. Install it with:
python3 -m pip install "requests[socks]"
(The quotes keep your shell from treating the square brackets as a glob pattern.)
Then, you can use SOCKS5 like this (use the socks5h:// scheme instead if you want DNS resolution to happen on the proxy side as well):
import requests
username = "yourusername"
password = "yourpassword"
socks5_proxy = f"socks5://{username}:{password}@proxyserver.com:1080"
proxies = {
    "http": socks5_proxy,
    "https": socks5_proxy,
}
resp = requests.get("https://ifconfig.me", proxies=proxies)
print(resp, resp.text)
Most paid proxy services require authentication. Here's a simple way to authenticate using basic credentials:
username = "yourusername"
password = "yourpassword"
proxies = {
    "http": f"http://{username}:{password}@proxyserver.com:1080",
    "https": f"https://{username}:{password}@proxyserver.com:443",
}
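One caveat: if your username or password contains special characters such as @, :, or /, percent-encode them first so the proxy URL parses correctly. A minimal sketch using the standard library, with placeholder credentials:
from urllib.parse import quote

# Placeholder credentials containing characters that would break the URL
username = quote("user@company", safe="")
password = quote("p@ss:word", safe="")
proxies = {
    "http": f"http://{username}:{password}@proxyserver.com:1080",
    "https": f"https://{username}:{password}@proxyserver.com:443",
}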
Sometimes, it's more convenient to use environment variables to set your proxies. Here's how:
$ export HTTP_PROXY='http://yourusername:[email protected]:1080'
$ export HTTPS_PROXY='https://yourusername:[email protected]:443'
$ export NO_PROXY='localhost,127.0.0.1'
$ python3
>>> import requests
>>> resp = requests.get("https://ifconfig.me/ip")
>>> print(resp.text)
186.188.228.86
These environment variables allow you to set default proxies without hardcoding them in your script.
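You can also set the same variables from inside Python, or opt a single Session out of them entirely. A short sketch, with placeholder proxy details:
import os
import requests

# Equivalent to the shell export above (placeholder proxy)
os.environ["HTTPS_PROXY"] = "https://yourusername:[email protected]:443"
resp = requests.get("https://ifconfig.me/ip")  # picks up HTTPS_PROXY automatically

session = requests.Session()
session.trust_env = False  # this session ignores proxy environment variables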
Want to set default configurations like headers, timeouts, or proxies? Use a Session object:
import requests
proxies = {
    "http": "http://username:[email protected]:1080",
    "https": "https://username:[email protected]:443",
}
session = requests.Session()
session.proxies.update(proxies)
resp = session.get("https://ifconfig.me")
print(resp.text)
Sessions are helpful when you're scraping a website that requires cookies or headers. Plus, they avoid the need to reconfigure proxies with every request.
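For instance, here's a minimal sketch of session-wide defaults; the User-Agent string and proxy endpoint are placeholders:
import requests

session = requests.Session()
session.headers.update({"User-Agent": "my-scraper/1.0"})  # sent with every request
session.proxies.update({
    "http": "http://username:[email protected]:1080",
    "https": "https://username:[email protected]:443",
})
resp = session.get("https://ifconfig.me/ip", timeout=10)
print(resp.status_code, resp.text)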
Let's say you're scraping a heavily protected site. Rotating proxies is a common solution. Here's how you can rotate through a list of proxies manually:
import random
import requests
proxies_list = [
    "http://proxy1.com:8080",
    "http://proxy2.com:8080",
    "http://proxy3.com:8080",
]

for _ in range(10):
    proxy = random.choice(proxies_list)
    proxies = {"https": proxy}
    resp = requests.get("https://ifconfig.me/ip", proxies=proxies)
    print(resp.text)
But if you want something more seamless, a professional provider like swiftproxy allows automatic IP rotation per request. Here's a quick example:
import requests
proxies = {
    "http": "http://username:[email protected]:1080",
    "https": "http://username:[email protected]:1080",
}
for _ in range(10):
    resp = requests.get("https://ifconfig.me/ip", proxies=proxies)
    print(resp.text)
Each request will come from a different IP.
Sticky proxies are useful when you need to keep the same IP across multiple requests, for example when scraping a page behind a login, where a mid-session IP change could get your account flagged.
Rotating proxies give you a new IP for every request, which is ideal for bypassing rate limits and anti-bot protections.
Here's an example using sticky proxies:
import requests
from uuid import uuid4
def sticky_proxies_demo():
    sessions = [uuid4().hex[:6] for _ in range(2)]
    for i in range(10):
        session = sessions[i % len(sessions)]
        http_proxy = f"http://username,session_{session}:[email protected]:1080"
        proxies = {"http": http_proxy, "https": http_proxy}
        resp = requests.get("https://ifconfig.me/ip", proxies=proxies)
        print(f"Session {session}: {resp.text}")

sticky_proxies_demo()
Here are some tips for handling network-related errors, such as requests.exceptions.ProxyError or the various timeout exceptions:
Retries: Implement a retry mechanism. Requests exposes urllib3's retry machinery through an HTTPAdapter mounted on a Session object, as sketched below.
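Requests doesn't retry on its own; the Retry class comes from urllib3 and is attached via an HTTPAdapter. A minimal sketch, with a placeholder proxy endpoint:
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 3 times with exponential backoff on connection failures
# and on the listed status codes
retry = Retry(total=3, backoff_factor=1, status_forcelist=[429, 500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)

session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)
session.proxies.update({"http": "http://proxyserver.com:1080",
                        "https": "http://proxyserver.com:1080"})
resp = session.get("https://ifconfig.me/ip", timeout=10)
print(resp.text)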
SSLError: If you're running into SSL certificate issues, you can disable verification with verify=False. Just be aware that this leaves the connection open to interception, and urllib3 will emit an InsecureRequestWarning unless you suppress it:
import requests
import urllib3

urllib3.disable_warnings()  # silence the InsecureRequestWarning from verify=False
# Reusing the proxy settings from the earlier examples
proxies = {"http": "http://username:[email protected]:1080",
           "https": "https://username:[email protected]:443"}
resp = requests.get("https://ifconfig.me/ip", proxies=proxies, verify=False)
print(resp.text)
By following these guidelines, you can save time, avoid costly mistakes, and build more efficient and reliable scraping scripts. Proxies may be tricky at first, but with these actionable insights, you will become proficient quickly.