Scraping Yahoo Finance Data for Real-Time Stock Insights with Python

SwiftProxy
By - Emily Chan
2025-01-07 14:49:58


Imagine being able to access real-time financial data with just a few lines of code. What if you could automatically track stock prices, market trends, and other key metrics from Yahoo Finance without manually refreshing a page? You can. This blog walks you through scraping Yahoo Finance using Python—no need for a deep dive into APIs or complex setups.
Let's cut to the chase. The financial world moves fast, and having the ability to extract and analyze key data in real-time is a game-changer for market analysts, traders, and anyone looking to stay ahead of the curve. Yahoo Finance holds a treasure trove of financial data—from stock prices to market news—and with Python, you can automate the whole process.

Why Scraping Yahoo Finance Matters

Yahoo Finance provides a wide range of data: live stock prices, historical charts, market trends, and more. This data is gold when you’re building financial models, developing trading algorithms, or conducting investment analysis. Scraping it allows you to bypass waiting on updates or relying on third-party APIs. And the best part? You own the data once you've got it.

Tools You'll Need

To make this happen, you'll need two Python libraries:

requests – for sending HTTP requests and retrieving web content.

lxml – for parsing the HTML content and extracting data using XPath.

Before jumping into the code, make sure you have these libraries installed:

pip install requests  
pip install lxml  

Step-by-Step Guide to Scraping Yahoo Finance

Step 1: Send an HTTP Request to Fetch Data

The first thing you'll need is to send an HTTP request to Yahoo Finance's stock page. We'll use requests for this. But here’s the catch: to avoid getting flagged as a bot, you need to send a request with headers that mimic a real browser request.
Here's the Python code to do just that:

import requests  
from lxml import html  

# URL of the stock page you want to scrape  
url = "https://finance.yahoo.com/quote/AMZN/"  

# Headers to simulate a real browser  
headers = {  
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'  
}  

# Send the HTTP request  
response = requests.get(url, headers=headers)  

Including browser-like headers such as User-Agent makes your request look like ordinary web traffic, which makes it harder for Yahoo Finance's anti-bot measures to flag you as a scraper.
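If you plan to fetch more than one page, a requests.Session is a convenient way to keep your browser-like identity in one place: it applies its default headers to every request and reuses the underlying connection. A minimal sketch, using the same example header values as above:

```python
import requests

# A Session applies default headers to every request made through it,
# so the browser-like identity is configured once.
session = requests.Session()
session.headers.update({
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/58.0.3029.110 Safari/537.3"
    ),
    "Accept-Language": "en-US,en;q=0.9",
})

# Every request made through the session now carries these headers:
# response = session.get("https://finance.yahoo.com/quote/AMZN/")
```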

Step 2: Parse the HTML and Extract Data Using XPath

Once you've fetched the page, you need to parse it and extract the data you need. We'll use XPath for this. XPath allows you to target specific parts of the HTML document—like a live stock price, trading volume, or the day's high and low.
Here's the code to extract key data points from the page:

# Parse the HTML content  
parser = html.fromstring(response.content)  

# Extract data using XPath  
title = parser.xpath('//h1[@class="yf-3a2v0c"]/text()')[0]  
live_price = parser.xpath('//fin-streamer[@class="livePrice yf-mgkamr"]/span/text()')[0]  
date_time = parser.xpath('//div[@slot="marketTimeNotice"]/span/text()')[0]  
open_price = parser.xpath('//ul[@class="yf-tx3nkj"]/li[2]/span[2]/fin-streamer/text()')[0]  
previous_close = parser.xpath('//ul[@class="yf-tx3nkj"]/li[1]/span[2]/fin-streamer/text()')[0]  
days_range = parser.xpath('//ul[@class="yf-tx3nkj"]/li[5]/span[2]/fin-streamer/text()')[0]  
week_52_range = parser.xpath('//ul[@class="yf-tx3nkj"]/li[6]/span[2]/fin-streamer/text()')[0]  
volume = parser.xpath('//ul[@class="yf-tx3nkj"]/li[7]/span[2]/fin-streamer/text()')[0]  
avg_volume = parser.xpath('//ul[@class="yf-tx3nkj"]/li[8]/span[2]/fin-streamer/text()')[0]  

This code pulls the stock title, live price, date and time of the last trade, and other key metrics in one pass. One caveat: the class names in these XPaths (such as yf-3a2v0c) are auto-generated by Yahoo's build tooling and change frequently, so verify them against the live page before relying on them.
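Indexing `[0]` on an XPath result raises an IndexError the moment Yahoo changes its markup and the query matches nothing. A small helper (hypothetical, not part of lxml) that falls back to a default lets the scraper fail soft instead. Here it is exercised against a tiny inline document standing in for a fetched page:

```python
from lxml import html

def first(parser, xpath, default="N/A"):
    """Return the first XPath match, or a default instead of raising IndexError."""
    matches = parser.xpath(xpath)
    return matches[0].strip() if matches else default

# Small inline document standing in for a fetched Yahoo Finance page
sample = html.fromstring(
    '<html><body><h1 class="yf-3a2v0c">Amazon.com, Inc. (AMZN)</h1></body></html>'
)

title = first(sample, '//h1[@class="yf-3a2v0c"]/text()')
missing = first(sample, '//fin-streamer[@class="livePrice"]/span/text()')
# title   -> "Amazon.com, Inc. (AMZN)"
# missing -> "N/A" (no exception when the markup changes)
```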

Step 3: Handle Anti-Bot Measures

Websites like Yahoo Finance often block scrapers. To bypass this, you can use proxies and rotate your headers.
Using Proxies: Proxies help mask your real IP address, making it harder for the website to detect automated scraping.
Here's how you can use a proxy:

proxies = {  
    "http": "http://your.proxy.server:port",  
    "https": "https://your.proxy.server:port"  
}  

response = requests.get(url, headers=headers, proxies=proxies)  

Rotating Headers: If you want to go a step further, rotate your User-Agent header for each request. This mimics requests from different browsers and makes you harder to detect.
Here's how you can rotate headers:

import random  

user_agents = [  
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",  
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0",  
    # Add more User-Agent strings here  
]  

headers["User-Agent"] = random.choice(user_agents)  

response = requests.get(url, headers=headers)  
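Alongside proxies and rotating User-Agents, spacing out your requests helps: calls that arrive at perfectly regular intervals look automated. A sketch of a randomized delay between requests (the base and jitter values here are arbitrary; tune them to your use case):

```python
import random
import time

def polite_delay(base=2.0, jitter=3.0):
    """Sleep for a randomized interval so requests don't arrive on a fixed clock."""
    delay = base + random.uniform(0, jitter)
    time.sleep(delay)
    return delay

# Between successive requests:
# for symbol in ["AMZN", "AAPL", "MSFT"]:
#     response = requests.get(f"https://finance.yahoo.com/quote/{symbol}/", headers=headers)
#     polite_delay()
```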

Step 4: Save the Data for Later

Once you've extracted the data, you'll likely want to save it for analysis. The simplest way is to write it to a CSV file.
Here's how you can save your scraped data:

import csv  

# Data to be saved  
data = [  
    ["URL", "Title", "Live Price", "Date & Time", "Open Price", "Previous Close", "Day's Range", "52 Week Range", "Volume", "Avg. Volume"],  
    [url, title, live_price, date_time, open_price, previous_close, days_range, week_52_range, volume, avg_volume]  
]  

# Save to CSV file  
with open("yahoo_finance_data.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.writer(file)  
    writer.writerows(data)  

print("Data saved to yahoo_finance_data.csv")  
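Note that opening the file in "w" mode overwrites it on every run. For the real-time use case you will usually want a growing history instead, appending one timestamped row per run. A sketch with placeholder values (the helper and file name are made up for illustration):

```python
import csv
import os

def append_snapshot(path, row, header):
    """Append one row to a CSV, writing the header only when the file is new."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(header)
        writer.writerow(row)

path = "price_history.csv"
if os.path.exists(path):
    os.remove(path)  # start fresh for this demo

header = ["Timestamp", "Symbol", "Live Price"]
append_snapshot(path, ["2025-01-07 14:49", "AMZN", "185.34"], header)
append_snapshot(path, ["2025-01-07 14:54", "AMZN", "185.61"], header)
# price_history.csv now holds one header row plus two data rows
```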

Putting It All Together

Here's the full script that integrates everything you've learned:

import requests
from lxml import html
import random
import csv

# URL to scrape
url = "https://finance.yahoo.com/quote/AMZN/"

# Headers for rotating User-Agent
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:76.0) Gecko/20100101 Firefox/76.0",
]

headers = {
    'User-Agent': random.choice(user_agents)
}

# Optional Proxy
proxies = {
    "http": "http://your.proxy.server:port",
    "https": "https://your.proxy.server:port"
}

# Send request with headers and proxies
response = requests.get(url, headers=headers, proxies=proxies)

if response.status_code == 200:
    parser = html.fromstring(response.content)

    # Extract data using XPath
    title = parser.xpath('//h1[@class="yf-3a2v0c"]/text()')[0]
    live_price = parser.xpath('//fin-streamer[@class="livePrice yf-mgkamr"]/span/text()')[0]
    date_time = parser.xpath('//div[@slot="marketTimeNotice"]/span/text()')[0]
    open_price = parser.xpath('//ul[@class="yf-tx3nkj"]/li[2]/span[2]/fin-streamer/text()')[0]
    previous_close = parser.xpath('//ul[@class="yf-tx3nkj"]/li[1]/span[2]/fin-streamer/text()')[0]
    days_range = parser.xpath('//ul[@class="yf-tx3nkj"]/li[5]/span[2]/fin-streamer/text()')[0]
    week_52_range = parser.xpath('//ul[@class="yf-tx3nkj"]/li[6]/span[2]/fin-streamer/text()')[0]
    volume = parser.xpath('//ul[@class="yf-tx3nkj"]/li[7]/span[2]/fin-streamer/text()')[0]
    avg_volume = parser.xpath('//ul[@class="yf-tx3nkj"]/li[8]/span[2]/fin-streamer/text()')[0]

    # Print data
    print(f"Title: {title}")
    print(f"Live Price: {live_price}")
    print(f"Date & Time: {date_time}")
    print(f"Open Price: {open_price}")
    print(f"Previous Close: {previous_close}")
    print(f"Day's Range: {days_range}")
    print(f"52 Week Range: {week_52_range}")
    print(f"Volume: {volume}")
    print(f"Avg. Volume: {avg_volume}")

    # Save data to CSV
    data = [
        ["URL", "Title", "Live Price", "Date & Time", "Open Price", "Previous Close", "Day's Range", "52 Week Range", "Volume", "Avg. Volume"],
        [url, title, live_price, date_time, open_price, previous_close, days_range, week_52_range, volume, avg_volume]
    ]

    with open("yahoo_finance_data.csv", "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerows(data)

    print("Data saved to yahoo_finance_data.csv")
else:
    print(f"Failed to retrieve data. Status code: {response.status_code}")

Conclusion

Scraping data from Yahoo Finance with Python is a simple, efficient way to automate the collection of financial data. By mastering requests, lxml, and proper scraping techniques like rotating headers and using proxies, you can reliably pull in key metrics for analysis. Remember, while this method is powerful, always adhere to legal and ethical guidelines when scraping.

About the Author

SwiftProxy
Emily Chan
Editor-in-Chief at Swiftproxy
Emily Chan is the Editor-in-Chief at Swiftproxy, with over ten years of experience in technology, digital infrastructure, and strategic communication. Based in Hong Kong, she combines deep regional knowledge with a clear, practical voice to help businesses navigate the evolving world of proxy solutions and data-driven growth.
The content provided on the Swiftproxy blog is intended for informational purposes only and is presented without any warranty. Swiftproxy does not guarantee the accuracy, completeness, or legal compliance of the information it contains, nor does it assume responsibility for the content of third-party sites referenced in the blog. Before engaging in any web scraping or automated data collection, readers are strongly advised to consult a qualified legal advisor and review the target site's applicable terms of service. In some cases, explicit authorization or a scraping permit may be required.