Amazon Associates Affiliate System - Setup Guide
Complete documentation for the intelligent Amazon affiliate integration in the Ciclismo Hoy cycling diary website.
Last Updated: November 21, 2025
Product Database Size: 101 products
Daily Rotation: Enabled (7-day cycle)
Table of Contents
- Prerequisites
- Configuration
- Product Database Overview
- Product Categories
- Race Terrain Tags
- Season Tags
- Book Tags
- Priority Scoring System
- Seasonal Filtering Logic
- Daily Rotation Algorithm
- Price Tracking System
- Daily Category Rotation
- Adding New Products
- URL Validation
- Local Testing
- GitHub Actions Automation
- Google Analytics 4 Integration
- Legal Compliance
- Troubleshooting
1. Prerequisites
Amazon Associates Account
- Sign up at Amazon Afiliados España
- Complete the application process:
  - Provide website URL: https://yourdomain.com
  - Describe your content: “Cycling race coverage and analysis”
  - Specify your niche: “Sports & Outdoors”
- Wait for approval (usually 1-3 business days)
- Get your affiliate tag (format: yourname-21)
Google Analytics 4 Account
- Create a GA4 property at Google Analytics
- Get your Measurement ID (format: G-XXXXXXXXXX)
- Enable enhanced measurement events
- Create custom events for affiliate tracking (optional, auto-created)
Development Environment
# Ruby and Jekyll
ruby --version # >= 2.7.0
bundle --version
# Python for automation
python3 --version # >= 3.8
pip3 install PyYAML requests beautifulsoup4
2. Configuration
Edit _config.yml
Replace the placeholder values:
# Amazon Affiliate Settings
amazon_affiliate_tag: "yourname-21" # ← REPLACE with your Amazon Associates tag
# UTM Parameters for tracking
amazon_utm_source: "ciclismohoy"
amazon_utm_medium: "affiliate"
# Google Analytics
google_analytics: "G-XXXXXXXXXX" # ← REPLACE with your GA4 Measurement ID
Verification
# Check configuration is valid YAML
bundle exec jekyll build --trace
If there are YAML syntax errors, the build will fail with details.
3. Product Database Overview
Current Database Statistics
- Total Products: 101
- Wheels: 6 (Shimano, Fulcrum)
- Nutrition: 16 (gels, recovery drinks, isotonic)
- Electronics: 15 (GPS computers, lights, radar)
- Clothing: 41 (jerseys, shorts, jackets, accessories)
- Books: 23 (Tour de France, grand tours, training, biographies)
Database Location
Products are stored in _data/amazon-products.yml
Field Structure
| Field | Type | Required | Description | Example |
|---|---|---|---|---|
| id | String | ✅ | Unique product identifier | wheels-shimano-rs100-rear |
| name_es | String | ✅ | Spanish product name | Shimano RS100 Rueda Trasera |
| amazon_link | String | ✅ | Amazon.es URL with tag placeholder | https://amazon.es/dp/B07DDB6WM9?tag=ciclismohoy-21 |
| categories | Array | ✅ | Product categories | [wheels, road, components] |
| race_terrain_tags | Array | ✅ | Compatible race terrains | [all] or [flat, hills-flat] |
| season_tags | Array | ✅ | Suitable seasons | [all-season] or [winter, fall] |
| book_tags | Array | ❌ | Race keywords (books only) | [tour-de-france, tour, tdf] |
| priority | Integer | ✅ | Base scoring priority (1-10) | 7 |
| price_tier | String | ✅ | Price category | budget, mid, premium |
| price_eur | Float | ❌ | Current price in euros | 45.99 |
| description_es | String | ✅ | Short Spanish description (max 100 chars) | Rueda trasera de carretera fiable |
| last_checked | String | ❌ | ISO timestamp of last check | 2025-11-21T09:35:39.532202Z |
| failed_checks | Integer | ✅ | Consecutive failed checks (0-3) | 0 |
| isActive | Boolean | ✅ | Product availability status | true |
Example Product Entry
- id: wheels-shimano-rs100-rear
  name_es: "Shimano RS100 Rueda Trasera Carretera"
  amazon_link: "https://www.amazon.es/Rueda-Tras-RS100-Pinza-QR-17C/dp/B07DDB6WM9?tag=ciclismohoy-21"
  categories: [wheels, road, components]
  race_terrain_tags: [all]
  season_tags: [all-season]
  priority: 5
  price_tier: budget
  price_eur: 45.99
  description_es: "Rueda trasera de carretera fiable para entrenamiento"
  last_checked: '2025-11-21T09:35:39.532202Z'
  failed_checks: 0
  isActive: true
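The constraints in the field table (required fields, priority range, price tiers, description length) are not caught by YAML syntax checks alone. The following is a minimal sketch of such a check, assuming the entry has already been loaded into a plain dict. `validate_product`, `REQUIRED_FIELDS` and the error strings are hypothetical names for illustration, not part of the repository's scripts.

```python
# Hypothetical helper: checks one product entry against the field table.
REQUIRED_FIELDS = (
    "id", "name_es", "amazon_link", "categories", "race_terrain_tags",
    "season_tags", "priority", "price_tier", "description_es",
    "failed_checks", "isActive",
)
PRICE_TIERS = {"budget", "mid", "premium"}


def validate_product(product):
    """Return a list of problems found in one product entry (empty if none)."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS
              if name not in product]
    priority = product.get("priority")
    if priority is not None and not (isinstance(priority, int) and 1 <= priority <= 10):
        errors.append("priority must be an integer from 1 to 10")
    tier = product.get("price_tier")
    if tier is not None and tier not in PRICE_TIERS:
        errors.append("price_tier must be budget, mid or premium")
    if len(product.get("description_es", "")) > 100:
        errors.append("description_es is longer than 100 characters")
    return errors


example = {
    "id": "wheels-shimano-rs100-rear",
    "name_es": "Shimano RS100 Rueda Trasera Carretera",
    "amazon_link": "https://www.amazon.es/Rueda-Tras-RS100-Pinza-QR-17C/dp/B07DDB6WM9?tag=ciclismohoy-21",
    "categories": ["wheels", "road", "components"],
    "race_terrain_tags": ["all"],
    "season_tags": ["all-season"],
    "priority": 5,
    "price_tier": "budget",
    "price_eur": 45.99,
    "description_es": "Rueda trasera de carretera fiable para entrenamiento",
    "failed_checks": 0,
    "isActive": True,
}
print(validate_product(example))  # prints [] for the example entry
```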
4. Product Categories
Available Categories
| Category | Count | Description | Example Products |
|---|---|---|---|
| wheels | 6 | Wheelsets and components | Shimano RS100, Fulcrum Racing Zero |
| nutrition | 16 | Sports nutrition products | Energy gels, recovery drinks, isotonic |
| energy | 7 | Energy gels and bars | Victory Endurance, 226ERS, HSN |
| recovery | 6 | Post-workout recovery | Recovery drinks, protein powders |
| hydration | 3 | Isotonic drinks | 226ERS, Powerbar, Crown Sport |
| electronics | 15 | GPS and safety devices | Garmin, Wahoo, Bryton, lights, radar |
| gps | 12 | GPS cycling computers | Edge 530/840, ELEMNT BOLT/ROAM |
| lights | 3 | Bike lights and safety | Sigma Buster, iGPSPORT, Garmin Varia |
| clothing | 41 | Cycling apparel | Jerseys, shorts, jackets, gloves |
| jersey | 20 | Cycling jerseys | TDF official, Santic, Spiuk, Gore Wear |
| shorts | 8 | Cycling shorts/bibs | Danish Endurance, Gore Wear, Spiuk |
| winter | 12 | Cold weather gear | Thermal tights, jackets, gloves |
| books | 23 | Cycling literature | Tour history, biographies, training |
Category Usage
- Primary category: First in the array (used for badge display)
- Multiple categories: Products can belong to several categories for flexible filtering
- Rotation grouping: Categories determine which day of week products are checked
5. Race Terrain Tags
The system detects race terrain from ProcyclingStats.com using CSS class selectors.
Terrain Detection Mapping
| ProcyclingStats Class | Terrain Tag | Description | Suitable Products |
|---|---|---|---|
| .icon.profile.p1 | flat | Flat stages | Aero bikes, deep wheels, TT equipment |
| .icon.profile.p2 | hills-flat | Hilly with flat finish | All-rounder bikes, mid-depth wheels |
| .icon.profile.p3 | hills-uphill | Hilly with uphill finish | Climbing bikes, lightweight wheels |
| .icon.profile.p4 | mountains-flat | Mountains with flat finish | All-rounder bikes, versatile gear |
| .icon.profile.p5 | mountains-uphill | Mountain top finish | Lightweight bikes, climbing wheels |
How Terrain is Captured
In scripts/scrape_and_generate.py, the detect_race_terrain() function:
def detect_race_terrain(soup):
    """
    Detect race terrain type from ProcyclingStats parcours profile icon.

    Returns:
        str: One of 'flat', 'hills-flat', 'hills-uphill', 'mountains-flat',
        'mountains-uphill', or None
    """
    profile_icon = soup.select_one('.icon.profile.p1, .icon.profile.p2, '
                                   '.icon.profile.p3, .icon.profile.p4, '
                                   '.icon.profile.p5')
    if profile_icon:
        if 'p1' in profile_icon.get('class', []):
            return 'flat'
        elif 'p2' in profile_icon.get('class', []):
            return 'hills-flat'
        elif 'p3' in profile_icon.get('class', []):
            return 'hills-uphill'
        elif 'p4' in profile_icon.get('class', []):
            return 'mountains-flat'
        elif 'p5' in profile_icon.get('class', []):
            return 'mountains-uphill'
    return None
This terrain tag is then saved in the frontmatter of each race markdown file:
---
title: "Stage 9 - Tour de France 2025"
race_terrain: "mountains-uphill" # ← Automatically detected
---
Product Configuration
When adding products, specify which terrains they’re suitable for:
# Example: Climbing bike
race_terrain_tags:
- "hills-uphill"
- "mountains-flat"
- "mountains-uphill"
# Example: Aero bike
race_terrain_tags:
- "flat"
- "hills-flat"
Terrain Bonus: Products matching the page’s race_terrain receive +5 points in scoring.
6. Season Tags
Products can be tagged for specific seasons based on typical weather conditions in cycling.
Season Mapping
| Season | Months | Typical Races | Product Examples |
|---|---|---|---|
| winter | Nov, Dec, Jan, Feb | Early season training races | Thermal jackets, long bibs, gloves |
| spring | Mar, Apr, May | Paris-Roubaix, Amstel Gold | Spring classics gear, windbreakers |
| summer | Jun, Jul, Aug | Tour de France, Giro, Vuelta | Lightweight jerseys, sunglasses |
| fall | Sep, Oct | Vuelta a España, Il Lombardia | Transitional clothing, arm warmers |
Season Detection
In _includes/amazon-affiliate-block.html, the current season is determined from the page date:
{% assign month = page.date | date: "%m" | plus: 0 %}
{% if month == 11 or month == 12 or month == 1 or month == 2 %}
{% assign current_season = "winter" %}
{% elsif month >= 3 and month <= 5 %}
{% assign current_season = "spring" %}
{% elsif month >= 6 and month <= 8 %}
{% assign current_season = "summer" %}
{% else %}
{% assign current_season = "fall" %}
{% endif %}
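For checking the month boundaries outside of Jekyll, the same mapping can be written in a few lines of Python. This is only a sketch mirroring the Liquid branches above; `season_for` is a hypothetical name and is not part of the site's scripts.

```python
from datetime import date


def season_for(day):
    """Map a date to the season tag used for scoring (mirrors the Liquid logic)."""
    month = day.month
    if month in (11, 12, 1, 2):
        return "winter"
    if 3 <= month <= 5:
        return "spring"
    if 6 <= month <= 8:
        return "summer"
    return "fall"  # September and October


print(season_for(date(2025, 7, 15)))  # summer
print(season_for(date(2025, 9, 21)))  # fall
```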
Product Configuration
# Example: Winter jacket
season_tags:
- "winter"
- "fall" # Can work in late fall too
# Example: Summer jersey (year-round in warm climates)
season_tags:
- "spring"
- "summer"
- "fall"
Season Bonus: Products matching the current season receive +3 points in scoring.
Important: Leave season_tags empty for year-round products (e.g., energy gels, basic maintenance tools).
7. Book Tags
Books about famous races receive special treatment with the book_tags field.
Supported Book Tags
| Tag | Race | Example Titles |
|---|---|---|
| tour | Tour de France | “The Official Tour de France Records” |
| giro | Giro d’Italia | “Giro d’Italia: The Story of the World’s Most Beautiful Race” |
| vuelta | Vuelta a España | “Vuelta: Inside Spain’s Greatest Race” |
| roubaix | Paris-Roubaix | “Paris-Roubaix: A Journey Through Hell” |
| monuments | Classic Monuments | “The Monuments: The Grit and the Glory” |
| biography | Rider biographies | “Lance Armstrong: Every Second Counts” (if applicable) |
Book Detection Logic
In _includes/amazon-affiliate-block.html:
{% assign is_famous_race = false %}
{% assign page_title_lower = page.title | downcase %}
{% if page_title_lower contains "tour" or
page_title_lower contains "giro" or
page_title_lower contains "vuelta" or
page_title_lower contains "roubaix" %}
{% assign is_famous_race = true %}
{% endif %}
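The keyword check above is a plain substring match on the lowercased title. A small Python sketch of the same test, useful for trying out titles before publishing a page; `is_famous_race` and `FAMOUS_RACE_KEYWORDS` are hypothetical names that mirror the Liquid variables:

```python
# Keywords taken from the Liquid condition above.
FAMOUS_RACE_KEYWORDS = ("tour", "giro", "vuelta", "roubaix")


def is_famous_race(page_title):
    """Return True if the lowercased title contains any race keyword."""
    title = page_title.lower()
    return any(keyword in title for keyword in FAMOUS_RACE_KEYWORDS)


print(is_famous_race("Stage 9 - Tour de France 2025"))  # True
print(is_famous_race("Il Lombardia 2025"))              # False
```

Because the match is on substrings, any title containing "tour" qualifies, including races such as the Tour de Luxembourg.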
Product Configuration
# Example: Tour de France book
- id: "book-tour-de-france-official"
  name_es: "Tour de France: The Official History"
  categories: ["books"]
  book_tags:
    - "tour"
  priority: 10 # High priority for books
Book Bonus:
- Books matching race keywords receive +10 points
- Books are always shown when is_famous_race = true (guaranteed slot in top 3)
8. Priority Scoring System
Each product receives a dynamic score based on multiple factors.
Scoring Formula
TOTAL_SCORE = base_priority + terrain_bonus + season_bonus + book_bonus
Where:
- base_priority = product.priority (1-10)
- terrain_bonus = +5 if terrain matches, 0 otherwise
- season_bonus = +3 if season matches, 0 otherwise
- book_bonus = +10 if book tag matches race, 0 otherwise
Scoring Examples
Example 1: Climbing bike on mountain stage in summer
Product:
priority: 8
race_terrain_tags: ["mountains-uphill"]
season_tags: ["spring", "summer", "fall"]
Page:
race_terrain: "mountains-uphill"
date: "2025-07-15" (summer)
Calculation:
8 (base) + 5 (terrain match) + 3 (season match) + 0 (not a book) = 16 points
Example 2: Tour de France book on Tour stage
Product:
priority: 10
book_tags: ["tour"]
Page:
title: "Stage 9 - Tour de France 2025"
Calculation:
10 (base) + 0 (no terrain) + 0 (no season) + 10 (book match) = 20 points
Example 3: Energy gel (year-round, all terrains)
Product:
priority: 4
race_terrain_tags: ["flat", "hills-flat", "hills-uphill", "mountains-flat", "mountains-uphill"]
season_tags: [] # Empty = always matches
Page:
race_terrain: "flat"
date: "2025-09-21" (fall)
Calculation:
4 (base) + 5 (terrain match) + 3 (season match, empty = always match) + 0 = 12 points
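The three calculations above can be reproduced with a short function. This is a sketch under stated assumptions, not the Liquid implementation: `total_score` and its parameter names are hypothetical, a missing season list (`None`) is treated as no match while an explicitly empty list is treated as year-round (one reading that reproduces all three examples), and a book tag is considered a match when it appears as a substring of the page title.

```python
def total_score(priority, terrain_tags=(), season_tags=None, book_tags=(),
                race_terrain=None, season=None, page_title=""):
    """Sum base priority and the terrain, season and book bonuses."""
    terrain_bonus = 5 if race_terrain in terrain_tags else 0

    # Assumption: None means the field is absent (no bonus);
    # an empty list means year-round (always gets the bonus).
    if season_tags is None:
        season_bonus = 0
    else:
        season_bonus = 3 if (not season_tags or season in season_tags) else 0

    # Assumption: a book tag matches when it occurs in the lowercased title.
    title = page_title.lower()
    book_bonus = 10 if any(tag in title for tag in book_tags) else 0

    return priority + terrain_bonus + season_bonus + book_bonus


# Example 1: climbing bike on a mountain stage in summer
print(total_score(8, terrain_tags=["mountains-uphill"],
                  season_tags=["spring", "summer", "fall"],
                  race_terrain="mountains-uphill", season="summer"))  # 16

# Example 2: Tour de France book on a Tour stage
print(total_score(10, book_tags=["tour"],
                  page_title="Stage 9 - Tour de France 2025"))  # 20
```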
Priority Guidelines
| Priority Range | Use Case | Examples |
|---|---|---|
| 1-3 | Low priority accessories | Basic tools, generic items |
| 4-6 | Medium priority consumables | Nutrition, standard clothing |
| 7-8 | High priority equipment | Quality bikes, wheelsets |
| 9-10 | Premium/specialized items | Top-end bikes, rare books |
9. Seasonal Filtering Logic
Products are always active unless explicitly marked isActive: false. The season tags are used for scoring, not filtering.
How It Works
- All active products (isActive: true) are considered
- Products with matching season_tags get +3 bonus points
- Products with empty season_tags are year-round and always get the season bonus
- Top-scored products are selected (after terrain and book bonuses)
Example Configuration
# Winter-specific jacket
- id: "clothing-castelli-perfetto"
season_tags: ["winter", "fall"]
isActive: true
# This gets +3 in winter/fall, 0 in spring/summer
# Year-round nutrition
- id: "nutrition-sis-gels"
season_tags: [] # Empty = always gets +3
isActive: true
# This ALWAYS gets +3 regardless of season
Disabling Products
To temporarily hide a product:
- id: "bike-out-of-stock"
  isActive: false # ← Product won't appear in any recommendations
This is automatically managed by the availability checker (see GitHub Actions).
10. Daily Rotation Algorithm
To provide variety while maintaining consistency, products rotate daily based on the current date.
How It Works
- All matching products are scored
- Products are sorted by score (highest first)
- A daily seed is calculated: day_of_year % product_count
- Products are rotated using the seed
- Top 3 products are displayed
Implementation
In _includes/amazon-affiliate-block.html:
{% comment %} Calculate day of year for rotation {% endcomment %}
{% assign day_of_year = page.date | date: "%j" | plus: 0 %}
{% comment %} After sorting by score, rotate products {% endcomment %}
{% assign rotation_offset = day_of_year | modulo: sorted_products.size %}
{% comment %} Select top 3 after rotation {% endcomment %}
{% for i in (0..2) %}
{% assign index = i | plus: rotation_offset | modulo: sorted_products.size %}
{% assign product = sorted_products[index] %}
{% comment %} Display product {% endcomment %}
{% endfor %}
Example Rotation
Given 10 matching products on a stage page:
- Day 1 (day_of_year = 264): offset = 264 % 10 = 4 → show products [4, 5, 6]
- Day 2 (day_of_year = 265): offset = 265 % 10 = 5 → show products [5, 6, 7]
- Day 3 (day_of_year = 266): offset = 266 % 10 = 6 → show products [6, 7, 8]
This ensures:
- ✅ Same products shown all day (consistency)
- ✅ Different products tomorrow (variety)
- ✅ All products eventually shown (fairness)
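The index arithmetic in the Liquid loop can be checked with a short Python sketch. `rotated_top` is a hypothetical name; unlike the Liquid loop, this version returns fewer than three items when fewer than three products match, instead of repeating entries.

```python
def rotated_top(sorted_products, day_of_year, count=3):
    """Pick `count` consecutive items starting at day_of_year % size, wrapping around."""
    size = len(sorted_products)
    if size == 0:
        return []
    offset = day_of_year % size
    # (i + offset) % size mirrors the `plus` and `modulo` filters in the template.
    return [sorted_products[(i + offset) % size] for i in range(min(count, size))]


products = list(range(10))         # stand-ins for 10 score-sorted products
print(rotated_top(products, 264))  # [4, 5, 6]
print(rotated_top(products, 269))  # [9, 0, 1] (wraps around the end)
```

Note that the offset is applied to the score-sorted list, so the highest-scored products are only shown on days when the offset lands on or near index 0.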
11. Price Tracking System
Products can optionally display current prices fetched from Amazon.es.
Price Field
Each product in _data/amazon-products.yml can have a price_eur field:
- id: "electronics-garmin-edge-840"
  name_es: "Garmin Edge 840"
  categories:
    - "electronics"
  price_eur: 329.97 # Float value, no currency symbol
  last_checked: "2025-11-21T09:35:39Z"
How Prices Are Fetched
The scripts/check_amazon_products.py script:
- Daily rotation: Checks specific product categories each weekday (see Section 12)
- HTTP GET request: Fetches full HTML page from Amazon
- BeautifulSoup parsing: Extracts price from multiple CSS selectors:
  - .a-price-whole (primary price element)
  - .priceToPay .a-price-whole (alternative location)
  - .a-offscreen (screen reader price)
- Price cleaning: Removes thousands separator (.), converts comma to decimal (, → .)
- Float conversion: Stores as float without currency symbol
- Timestamp update: Records ISO 8601 timestamp in last_checked
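The price cleaning step can be illustrated as follows. This is a minimal sketch assuming the text has already been extracted from one of the selectors; `parse_price_es` is a hypothetical helper name and the actual script may clean the string differently.

```python
import re


def parse_price_es(text):
    """Convert a Spanish-formatted price such as '1.299,99 €' to a float, or None."""
    # Keep only digits and the two separator characters.
    cleaned = re.sub(r"[^\d.,]", "", text)
    if not cleaned:
        return None
    # Drop the thousands separator, then turn the decimal comma into a point.
    cleaned = cleaned.replace(".", "").replace(",", ".")
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_price_es("1.299,99 €"))             # 1299.99
print(parse_price_es("45,99 €"))                # 45.99
print(parse_price_es("Currently unavailable"))  # None
```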
Price Display in Templates
In _includes/amazon-affiliate-block.html:
{% if product.price_eur %}
<span class="product-price">{{ product.price_eur }}€</span>
{% endif %}
Conditional display: Prices only shown if price_eur exists and is not null.
Price Badge Styling
Prices are styled as badges matching the category badges:
.product-price {
display: inline-block;
background: #e3f2fd; /* Light blue background */
color: #0056b3; /* Dark blue text */
margin: 0 !important; /* No margin, inline with category */
margin-bottom: 0.8rem !important; /* Bottom spacing */
padding: 0.3rem 0.8rem;
border-radius: 12px;
font-size: 0.75rem;
font-weight: 700;
text-transform: uppercase;
}
Visual appearance:
- Blue badge next to category badge
- Uppercase text for consistency
- Same size and padding as category badges
- Displayed above product description
Price Accuracy
Important notes:
- Prices are informational only, not guaranteed to be current
- Amazon prices change frequently throughout the day
- Price displayed reflects last check time (see last_checked timestamp)
- Users always see current price on Amazon when they click through
- Daily rotation means each product category is checked only once per week
Price Scraping Challenges
Why not use Amazon API?
- Amazon Product Advertising API requires approval
- API has strict rate limits and query restrictions
- Associates program doesn’t grant automatic API access
Scraping limitations:
- Amazon’s HTML structure changes occasionally (requires maintenance)
- Rate limiting: 10-20 second delays between requests to avoid blocks
- Some products return 200 OK but have no price (“Currently unavailable”)
- Prices may differ by region, Prime membership, or promotions
How to handle failures:
- If price extraction fails, price_eur remains null (no price displayed)
- Product still shows without price badge
- Next scheduled check will retry
Manual Price Updates
You can manually set prices in _data/amazon-products.yml:
- id: "wheels-mavic-cosmic-sl-45"
  price_eur: 899.99
  last_checked: "2025-11-21T10:00:00Z" # Manual timestamp
Use cases:
- Quick testing before scraper runs
- Override automated prices if scraper gets wrong value
- Set prices for new products immediately
12. Daily Category Rotation
To avoid Amazon rate limiting and reduce server load, the price checker uses a 7-day category rotation system.
Why Category Rotation?
Problem: Checking all 101 products daily would:
- Take 17-34 minutes (10-20s delay × 101 products)
- Risk Amazon blocking IP address for excessive requests
- Consume unnecessary GitHub Actions minutes
Solution: Check different product categories each weekday:
- Each category checked once per week
- Total check time: roughly 0.5-14 minutes per day (depending on category size)
- Prices stay reasonably fresh (7-day maximum staleness)
Category Schedule
| Weekday | Category | Products Checked | Estimated Time |
|---|---|---|---|
| Monday | Wheels & Components | ~6 products | 1-2 minutes |
| Tuesday | Energy Gels | ~7 products | 1-2 minutes |
| Wednesday | Recovery Products | ~6 products | 1-2 minutes |
| Thursday | Hydration | ~3 products | 0.5-1 minute |
| Friday | Electronics | ~15 products | 2.5-5 minutes |
| Saturday | Clothing | ~41 products | 7-14 minutes |
| Sunday | Books | ~23 products | 4-8 minutes |
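The estimated times in the schedule follow directly from the 10-20 second delay between requests. A small sketch of that arithmetic, using the product counts from the table; `CATEGORY_COUNTS` and `estimated_minutes` are hypothetical names, and the estimate ignores request latency and the initial delay.

```python
# Product counts per weekday, taken from the schedule table above.
CATEGORY_COUNTS = {
    "Monday": 6, "Tuesday": 7, "Wednesday": 6, "Thursday": 3,
    "Friday": 15, "Saturday": 41, "Sunday": 23,
}


def estimated_minutes(product_count, min_delay=10, max_delay=20):
    """Return (lower, upper) bounds in minutes for checking product_count items."""
    return (product_count * min_delay / 60, product_count * max_delay / 60)


for weekday, count in CATEGORY_COUNTS.items():
    low, high = estimated_minutes(count)
    print(f"{weekday}: {count} products, {low:.1f}-{high:.1f} min")

# Full check of all products (the --all case)
low, high = estimated_minutes(sum(CATEGORY_COUNTS.values()))
print(f"All: {low:.1f}-{high:.1f} min")  # All: 16.8-33.7 min
```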
Implementation
In scripts/check_amazon_products.py:
from datetime import datetime

CATEGORY_ROTATION = {
    0: ['wheels', 'components'],   # Monday
    1: ['nutrition-energy'],       # Tuesday
    2: ['nutrition-recovery'],     # Wednesday
    3: ['nutrition-hydration'],    # Thursday
    4: ['electronics'],            # Friday
    5: ['clothing'],               # Saturday
    6: ['books'],                  # Sunday
}

def get_categories_to_check(check_all=False):
    if check_all:
        return None  # Check all categories
    weekday = datetime.now().weekday()  # 0 = Monday, 6 = Sunday
    return CATEGORY_ROTATION.get(weekday, [])
How It Works
- Daily GitHub Actions run at 3 AM UTC
- Script checks current weekday (0=Monday, 6=Sunday)
- Filters products by category:
def product_matches_categories(product, categories):
    if categories is None:
        return True  # Check all
    product_categories = product.get('categories', [])
    if isinstance(product_categories, str):
        product_categories = [product_categories]
    return any(cat in product_categories for cat in categories)
- Checks only matching products
- Updates price_eur and last_checked fields
- Commits changes if prices updated
Manual Full Check
You can override rotation and check all products:
Command-line:
cd scripts
python3 check_amazon_products.py --all
GitHub Actions:
- Go to Actions tab
- Select “Amazon Product Availability Checker”
- Click “Run workflow”
- Check “Check all products (ignore rotation)” option
- Click “Run workflow”
This runs a full check of all 101 products (takes 17-34 minutes).
Rate Limiting Protection
Delays between requests:
import random
import time
# Initial delay before first request
time.sleep(random.uniform(3, 5))
# Between each product
time.sleep(random.uniform(10, 20))
User-Agent rotation:
USER_AGENTS = [
'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...',
'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...',
'Mozilla/5.0 (X11; Linux x86_64)...'
]
headers = {'User-Agent': random.choice(USER_AGENTS)}
Why 10-20 seconds?
- Amazon typically allows 1-2 requests per second for light scraping
- 10-20s delay = 3-6 requests per minute (well under typical limits)
- Appears more like human browsing than bot behavior
- Reduces chance of CAPTCHA or IP blocking
Monitoring Rotation
Check which categories were updated:
# Show last_checked timestamps by category
grep -A 5 'categories:' _data/amazon-products.yml | grep -A 1 'last_checked'
View GitHub Actions logs:
- Go to Actions tab
- Click on latest workflow run
- See output:
Categories to check today (Monday): ['wheels', 'components']
Checking 6 products...
Updated 5 products with new prices
13. Adding New Products
Step-by-Step Process
1. Find Product on Amazon.es
- Go to Amazon.es
- Search for cycling product: “specialized tarmac sl7”
- Open the product page
- Get the standard product URL:
https://amazon.es/dp/B08XXXXX
2. Extract Product Information
- Product name: From page title
- Price: Current price (for reference, not stored)
- Image URL: Right-click image → “Copy image address”
- Description: Write a concise Spanish description
3. Determine Tags
Terrain tags:
- Is this product better for flat, hilly, or mountain stages?
- Example: Aero bike → ["flat", "hills-flat"]
- Example: Climbing bike → ["mountains-uphill"]
Season tags:
- When is this product most useful?
- Example: Winter jacket → ["winter", "fall"]
- Example: Energy gel → [] (year-round, leave empty)
Book tags (only for books):
- Does this book cover a famous race?
- Example: Tour de France book → ["tour"]
4. Set Priority
- 1-3: Generic accessories
- 4-6: Standard consumables and clothing
- 7-8: Quality bikes and equipment
- 9-10: Premium items and specialized books
5. Add to _data/amazon-products.yml
- id: "wheels-mavic-cosmic-sl-45" # Unique kebab-case ID
  name_es: "Mavic Cosmic SL 45" # Spanish product name
  categories: # Array of categories
    - "wheels"
  amazon_link: "https://amazon.es/dp/B09XXXXX" # Standard Amazon URL
  description_es: "Ruedas aerodinámicas de carbono, ideales para etapas llanas y contrarrelojes."
  race_terrain_tags:
    - "flat"
    - "hills-flat"
  season_tags:
    - "spring"
    - "summer"
    - "fall"
  priority: 8
  price_eur: null # Will be filled by scraper
  isActive: true
  last_checked: null # ISO 8601 timestamp, set by the checker
  failed_checks: 0
Field reference:
- id: Unique identifier (kebab-case)
- name_es: Spanish product name
- categories: Array of categories (can have multiple)
- amazon_link: Standard Amazon.es URL format
- description_es: Spanish description (shown in product card)
- race_terrain_tags: Terrain types (see Section 5)
- season_tags: Seasons (see Section 6), empty = year-round
- priority: Base priority 1-10 (see Section 8)
- price_eur: Current price as float (filled by scraper)
- isActive: Boolean, set to false by scraper if unavailable
- last_checked: ISO timestamp of last availability check
- failed_checks: Counter incremented on HTTP errors
6. Validate YAML Syntax
# Check for YAML errors
python3 -c "import yaml; yaml.safe_load(open('_data/amazon-products.yml'))"
# Or build Jekyll site
bundle exec jekyll build --trace
7. Test Locally
bundle exec jekyll serve
# Visit http://localhost:4000
# Navigate to a race page matching your product's tags
# Verify the product appears in recommendations
14. URL Validation
All Amazon URLs must follow the standard format for the availability checker to work.
Valid URL Format
https://amazon.es/dp/{ASIN}
Where {ASIN} is the 10-character Amazon Standard Identification Number.
Examples
✅ Valid:
https://amazon.es/dp/B08L87NP4M
https://www.amazon.es/dp/B08L87NP4M
❌ Invalid:
https://amazon.es/Specialized-Tarmac-SL7-Carbon/dp/B08L87NP4M/ref=xxx
https://amzn.to/3xYzAbc
https://amazon.com/dp/B08L87NP4M (wrong domain)
Regex Pattern
The validation script uses this pattern:
import re

# The optional (?:.*/)? group lets the bare /dp/{ASIN} form match.
AMAZON_URL_PATTERN = re.compile(
    r'^https?://(?:www\.)?amazon\.es/(?:.*/)?dp/([A-Z0-9]{10})(?:/.*)?$',
    re.IGNORECASE
)
How to Get Standard URL
- Go to product page on Amazon.es
- Look at URL bar
- Extract the ASIN (10-char code after /dp/)
- Construct: https://amazon.es/dp/{ASIN}
Example:
Full URL:
https://www.amazon.es/Specialized-Tarmac-SL7-Comp-Carbon-Road-Bike/dp/B08L87NP4M/ref=sr_1_1?keywords=specialized+tarmac
Extracted ASIN:
B08L87NP4M
Standard URL:
https://amazon.es/dp/B08L87NP4M
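The manual steps above can also be automated. The following is a sketch, assuming only amazon.es and www.amazon.es hosts are acceptable; `to_standard_url` and `ASIN_IN_PATH` are hypothetical names and are not taken from the validation script.

```python
import re
from urllib.parse import urlparse

# Matches /dp/ followed by a 10-character ASIN at the end of a path segment.
ASIN_IN_PATH = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)", re.IGNORECASE)


def to_standard_url(url):
    """Return https://amazon.es/dp/{ASIN} for an Amazon.es product URL, else None."""
    parts = urlparse(url)
    if parts.hostname not in ("amazon.es", "www.amazon.es"):
        return None  # short links and other Amazon domains are rejected
    match = ASIN_IN_PATH.search(parts.path)
    if match is None:
        return None
    return f"https://amazon.es/dp/{match.group(1).upper()}"


full_url = ("https://www.amazon.es/Specialized-Tarmac-SL7-Comp-Carbon-Road-Bike"
            "/dp/B08L87NP4M/ref=sr_1_1?keywords=specialized+tarmac")
print(to_standard_url(full_url))                   # https://amazon.es/dp/B08L87NP4M
print(to_standard_url("https://amzn.to/3xYzAbc"))  # None
```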
15. Local Testing
Build and Serve
# Install dependencies
bundle install
# Serve locally with live reload
bundle exec jekyll serve --livereload
# Serve on specific port
bundle exec jekyll serve --port 4001
# Serve accessible from network
bundle exec jekyll serve --host 0.0.0.0
Test Scenarios
1. Test Affiliate Block Display
Navigate to a race page:
http://localhost:4000/stages/2025/09-september/21/2025-85thskoda-tour-de-luxembourg-2-pro/
Check:
- ✅ Affiliate block appears at bottom of page
- ✅ 3 products are displayed
- ✅ Products have images, names, descriptions
- ✅ “Ver en Amazon” buttons are present
- ✅ Amazon Associates disclosure text appears
2. Test Terrain Matching
Create a test race page with specific terrain:
---
layout: default
title: "Test Mountain Stage"
race_terrain: "mountains-uphill"
date: 2025-07-15
---
Test content here.
Expected: Products with race_terrain_tags: ["mountains-uphill"] should score higher and appear first.
3. Test Season Filtering
Change your system date or modify the page date field:
date: 2025-01-15 # Winter
Expected: Products with season_tags: ["winter"] should get +3 bonus points.
4. Test Book Recommendations
Navigate to a Tour de France stage:
http://localhost:4000/stages/2025/07-july/15/tour-de-france-stage-9/
Expected: Books with book_tags: ["tour"] should appear with high priority.
5. Test Daily Rotation
Build the site twice on different days:
# Day 1
bundle exec jekyll build
cat _site/stages/.../index.html | grep "product-card" -A 5
# Day 2 (change system date or page date)
bundle exec jekyll build
cat _site/stages/.../index.html | grep "product-card" -A 5
Expected: Different products in the top 3 on different days.
6. Test Analytics Tracking
- Open browser DevTools (F12)
- Go to Console tab
- Navigate to a race page
- Click “Ver en Amazon” button
- Check console for:
Tracking affiliate click: {
product_id: "bike-specialized-tarmac",
product_name: "Specialized Tarmac SL7",
race_terrain: "flat",
click_url: "https://amazon.es/dp/B08XXXXX?tag=yourname-21&...",
source_page: "/stages/2025/09-september/21/...",
value: 1,
currency: "EUR"
}
Debugging Tips
Products not appearing?
Check:
- Is isActive: true in product YAML?
- Does page have race_terrain in frontmatter?
- Are race_terrain_tags compatible with race_terrain?
- Is current season matching season_tags?
Affiliate tag not in URLs?
Check:
- _config.yml has amazon_affiliate_tag: "yourname-21"
- Jekyll site was rebuilt after config change
- Browser cache cleared (Ctrl+Shift+R)
CSS not loading?
Check:
- assets/css/style.css exists
- _layouts/default.html has <link> tag to stylesheet
- Jekyll build completed without errors
16. GitHub Actions Automation
The availability and price checker runs daily via GitHub Actions to keep prices fresh and ensure product URLs remain valid.
Workflow File
.github/workflows/check-amazon-products.yml
Schedule
schedule:
- cron: '0 3 * * *' # Every day at 3:00 AM UTC
Timezone: UTC (Coordinated Universal Time)
Frequency: Daily (365 times per year)
Category rotation: Each day checks specific categories (see Section 12)
Manual Trigger
You can also run the workflow manually:
- Go to GitHub repository
- Click Actions tab
- Select “Amazon Product Availability Checker” workflow
- Click “Run workflow” button
- Select branch (usually main)
- Optional: Check “Check all products (ignore rotation)” to run full check
- Click “Run workflow”
Default behavior: Follows daily rotation schedule
With “Check all” enabled: Checks all 101 products (takes 17-34 minutes)
What It Does
- Checks out repository code
- Sets up Python 3.11 environment
- Installs dependencies: PyYAML, requests, beautifulsoup4
- Determines category to check based on weekday (or all if manual with check_all)
- Runs availability and price checker script
- Fetches prices from Amazon using BeautifulSoup HTML parsing
- Updates price_eur and last_checked fields in YAML
- Commits changes if products were updated
- Creates GitHub Issue if validation errors occurred
Exit Codes
| Exit Code | Meaning | Action Taken |
|---|---|---|
| 0 | No changes needed | No commit, workflow ends |
| 1 | Products updated | Commit changes to _data/amazon-products.yml |
| 2 | Validation errors found | Create GitHub Issue with details |
Monitoring Workflow
View Workflow Runs
- Go to Actions tab in GitHub
- See list of past runs with status icons:
- ✅ Green check = success
- ❌ Red X = failure
- 🟡 Yellow dot = in progress
Read Workflow Logs
- Click on a workflow run
- Click on job name (e.g., “check-products”)
- Expand steps to see detailed logs
Check for Issues
If validation errors occur, a GitHub Issue is automatically created:
- Go to Issues tab
- Look for issue titled: “Amazon Product Availability Check Failed”
- Read error details in issue description
- Fix products in _data/amazon-products.yml
- Push changes to trigger another check
Availability and Price Logic
The checker script (scripts/check_amazon_products.py) performs:
- URL Validation:
  - Ensures all URLs match https://amazon.es/dp/{ASIN} pattern
  - Creates validation error if format is wrong
- Category Filtering:
  - Determines categories to check based on weekday
  - Filters products matching today’s categories
  - Uses all products if --all flag set
- HTTP GET Request:
  - Sends full GET request to fetch HTML page
  - Uses rotating User-Agent headers
  - Includes 10-20 second random delay between requests
  - Initial 3-5 second delay before first request
- Status Code Analysis:
  - 200 OK: Product available → reset failed_checks to 0
  - 404 Not Found: Product not found → increment failed_checks
  - Other errors: Network/server issue → increment failed_checks
- Price Extraction (BeautifulSoup):
  - Parses HTML with BeautifulSoup4
  - Tries multiple CSS selectors: .a-price-whole, .priceToPay .a-price-whole, .a-offscreen
  - Cleans price string: removes thousands separator, converts comma to decimal
  - Converts to float and stores in price_eur
  - Updates last_checked with ISO 8601 timestamp
- Deactivation Threshold:
  - If failed_checks >= 3 → set isActive: false
  - Product hidden from all recommendations
- Reactivation:
  - If previously failed product returns 200 OK → set isActive: true, reset failed_checks to 0
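The status code, deactivation and reactivation rules amount to a small state update per product. The sketch below illustrates that logic under stated assumptions: `apply_check_result` is a hypothetical helper, the counter is capped at the threshold to stay within the 0-3 range given in the field table, and any 200 response reactivates the product.

```python
def apply_check_result(product, status_code, max_failures=3):
    """Return a copy of `product` with failed_checks and isActive updated."""
    updated = dict(product)
    if status_code == 200:
        # Available: reset the counter and (re)activate the product.
        updated["failed_checks"] = 0
        updated["isActive"] = True
    else:
        # 404 or any other error: count the failure (assumed cap at max_failures).
        failures = min(product.get("failed_checks", 0) + 1, max_failures)
        updated["failed_checks"] = failures
        if failures >= max_failures:
            updated["isActive"] = False
    return updated


product = {"id": "wheels-shimano-rs100-rear", "failed_checks": 0, "isActive": True}
for status in (404, 503, 404):
    product = apply_check_result(product, status)
print(product["failed_checks"], product["isActive"])  # 3 False

product = apply_check_result(product, 200)
print(product["failed_checks"], product["isActive"])  # 0 True
```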
Customizing Check Frequency
Current schedule: Daily at 3 AM UTC
Edit .github/workflows/check-amazon-products.yml:
# Current: Daily at 3 AM UTC
schedule:
- cron: '0 3 * * *'
# Alternative: Twice daily (3 AM and 3 PM UTC)
schedule:
- cron: '0 3,15 * * *'
# Alternative: Every 6 hours
schedule:
- cron: '0 */6 * * *'
# Alternative: Weekly only (original schedule)
schedule:
- cron: '0 3 * * 1' # Mondays only
Note: With daily rotation, running the workflow more than once per day only re-checks that day’s categories; products in other categories are still refreshed once per week, on their scheduled weekday.
Use crontab.guru to generate cron expressions.
17. Google Analytics 4 Integration
Track affiliate link performance with Google Analytics 4.
Setup
1. Get Measurement ID
- Go to Google Analytics
- Create or select a GA4 property
- Go to Admin → Data Streams
- Select your web stream
- Copy Measurement ID (format:
G-XXXXXXXXXX)
2. Add to Configuration
Edit _config.yml:
google_analytics: "G-XXXXXXXXXX" # ← Your Measurement ID
3. Rebuild Site
bundle exec jekyll build
git add _config.yml
git commit -m "Add GA4 measurement ID"
git push
Custom Events
The system tracks affiliate clicks with custom events:
gtag('event', 'affiliate_click', {
'product_id': 'bike-specialized-tarmac',
'product_name': 'Specialized Tarmac SL7',
'race_terrain': 'flat',
'click_url': 'https://amazon.es/dp/B08XXXXX?tag=...',
'source_page': '/stages/2025/09-september/21/...',
'value': 1,
'currency': 'EUR'
});
Viewing Analytics
Real-Time Reports
- Go to Reports → Real-time
- Open a race page on your site
- Click an affiliate link
- See event appear in Real-time report (within 30 seconds)
Event Reports
- Go to Reports → Engagement → Events
- Find affiliate_click event
- Click to see details:
- Event count
- Users who triggered event
- Event parameters (product_id, race_terrain, etc.)
Creating Custom Dashboard
- Go to Explore → Blank
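- Add dimensions: Event name, product_id (custom parameter), race_terrain (custom parameter). Custom parameters must first be registered as event-scoped custom dimensions in GA4 before they can be selected.
- Add metrics: Event count, Total users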
- Create visualizations:
- Table: Product performance by ID
- Bar chart: Clicks by race terrain
- Time series: Daily affiliate clicks
Conversion Tracking
To track actual Amazon purchases (requires Amazon Associates):
- Get conversion data from Amazon Associates dashboard
- Manually import to GA4, or
- Use Google Tag Manager with Amazon OneLink integration
Note: Amazon doesn’t provide pixel tracking, so conversion data must be matched manually via UTM parameters.
UTM Parameters
All affiliate links include UTM parameters for tracking:
https://amazon.es/dp/B08XXXXX?tag=yourname-21&utm_source=ciclismohoy&utm_medium=affiliate&utm_campaign=race-page&utm_content=product-id
- utm_source: ciclismohoy (your site name)
- utm_medium: affiliate (traffic type)
- utm_campaign: race-page or homepage (page type)
- utm_content: Product ID (e.g., bike-specialized-tarmac)
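The URL structure described above can be reproduced with the Python standard library, which is handy when checking a generated link by hand. This is a minimal sketch: build_affiliate_url is a hypothetical helper, not part of the site's templates, and the ASIN and tag are the placeholders used in this guide.

```python
from urllib.parse import urlencode


def build_affiliate_url(
    asin: str,
    tag: str,
    campaign: str,
    product_id: str,
    source: str = "ciclismohoy",
    medium: str = "affiliate",
) -> str:
    """Assemble an amazon.es product URL with the affiliate tag and UTM parameters."""
    # Dicts keep insertion order, so the query string matches the order shown above.
    params = {
        "tag": tag,
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_content": product_id,
    }
    return f"https://amazon.es/dp/{asin}?{urlencode(params)}"


print(build_affiliate_url("B08XXXXX", "yourname-21", "race-page", "bike-specialized-tarmac"))
```

Using urlencode rather than string concatenation means product IDs with unusual characters are percent-encoded instead of breaking the link.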
Privacy Compliance
The GA4 implementation respects privacy:
- Cookie consent required: GA4 only loads if user accepts cookies (see _includes/cookie-consent.html)
- IP anonymization: Enabled by default in GA4
- Data retention: Set to 14 months (configurable in GA4 settings)
- User deletion: Can be requested via privacy policy contact
18. Legal Compliance
Amazon Associates Requirements
Operating Agreement
You must comply with the Amazon Associates Operating Agreement:
- Clear disclosure: All affiliate links must be clearly disclosed
- Accurate content: Product information must be truthful
- No prohibited content: No illegal, hateful, or violent content
- Link freshness: Regularly update product links
Disclosure Text
Our implementation includes disclosure in:
- Affiliate block footer:
<p class="affiliate-disclaimer"> Como Afiliados de Amazon, ganamos por compras calificadas. </p>
- Dedicated disclosure page: /amazon-disclosure/
- Footer link: Every page has a link to the disclosure page
GDPR Compliance
Cookie Consent
The site uses _includes/cookie-consent.html to:
- Inform users about cookies (Google Analytics, Amazon tracking)
- Request consent before loading GA4
- Provide opt-out option
Privacy Policy
Update /privacy-policy.md to include:
- Data collected: Affiliate clicks, product views
- How data is used: Analytics, improving recommendations
- Third parties: Amazon, Google Analytics
- User rights: Access, deletion, opt-out
Cookie Policy
Update /cookie-policy.md to include:
- First-party cookies: Session, preferences
- Third-party cookies: Google Analytics, Amazon Associates
- How to disable: Browser settings, consent banner
FTC Guidelines (US)
If targeting US audience, follow FTC Endorsement Guidelines:
- Clear and conspicuous disclosure (already implemented)
- Proximity to affiliate link (disclosure in same block as products)
- Plain language (Spanish: “ganamos por compras calificadas”)
Country-Specific Requirements
Spain (Where Site Operates)
- ✅ RGPD/GDPR compliance: Cookie consent banner
- ✅ LSSI (Ley de Servicios de la Sociedad de la Información): Disclosure page
- ✅ Consumer protection: Accurate product information
EU General
- ✅ Consumer Rights Directive: Clear pricing, no hidden fees
- ✅ ePrivacy Directive: Cookie consent before tracking
19. Troubleshooting
Common Issues
Prices Not Displaying
Symptom: Products show but no price badges appear
Possible Causes:
- price_eur field is null (product not checked yet)
- Price scraper failed to extract price
- Product’s category not checked this week
- CSS not applied to .product-price class
Solutions:
# Check if product has price_eur value
grep -A 10 'id: "electronics-garmin' _data/amazon-products.yml
# Run checker manually
cd scripts
python3 check_amazon_products.py # Uses daily rotation
python3 check_amazon_products.py --all # Checks all products
# Check Jekyll build includes prices
grep -A 5 'product-price' _site/index.html
# Verify CSS loaded
curl -s http://localhost:4000/assets/css/style.css | grep 'product-price'
Expected behavior:
- Products without price_eur show no price badge (normal)
- Products with price_eur: null show no price badge
- Products with price_eur: 329.97 show “329.97€” in blue badge
- Each category checked once per week via daily rotation
Products Not Appearing
Symptom: Affiliate block is empty or shows “No products available”
Possible Causes:
- All products have isActive: false
- No products match current terrain/season
- YAML syntax error in product database
Solutions:
# Check YAML syntax
python3 -c "import yaml; print(yaml.safe_load(open('_data/amazon-products.yml')))"
# Check if any products are active
grep "isActive: true" _data/amazon-products.yml
# Rebuild Jekyll site
bundle exec jekyll clean
bundle exec jekyll build --trace
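To tell the first two causes apart, it can help to count how many products survive each filter. The sketch below works on an already-loaded list of product dictionaries; diagnose_products is a hypothetical helper, and it assumes terrain_tags is a list of strings such as "flat", which this section does not confirm.

```python
from typing import Any, Dict, List


def diagnose_products(products: List[Dict[str, Any]], race_terrain: str) -> Dict[str, int]:
    """Count products at each filtering stage to show where they drop out."""
    active = [p for p in products if p.get("isActive", False)]
    # Assumption: terrain_tags is a list of terrain names on each product.
    matching = [p for p in active if race_terrain in p.get("terrain_tags", [])]
    return {
        "total": len(products),
        "active": len(active),
        "active_matching_terrain": len(matching),
    }


sample = [
    {"id": "a", "isActive": True, "terrain_tags": ["flat", "mountain"]},
    {"id": "b", "isActive": False, "terrain_tags": ["flat"]},
    {"id": "c", "isActive": True, "terrain_tags": ["mountain"]},
]
print(diagnose_products(sample, "flat"))
```

A zero in "active" points to deactivated products, while a zero only in "active_matching_terrain" points to the terrain tags. Season filtering is left out because its field name is not shown in this section.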
Wrong Products Displayed
Symptom: Products don’t match the race terrain
Possible Causes:
- race_terrain not set in frontmatter
- Product terrain_tags misconfigured
- Scoring algorithm not working
Debug:
{% comment %} Add to _includes/amazon-affiliate-block.html {% endcomment %}
<p>Debug: race_terrain = {{ page.race_terrain }}</p>
<p>Debug: current_season = {{ current_season }}</p>
Affiliate Tag Missing from URLs
Symptom: Amazon URLs don’t include ?tag=yourname-21
Possible Causes:
- amazon_affiliate_tag not set in _config.yml
- Jekyll site not rebuilt after config change
- Browser caching old HTML
Solutions:
# Verify config
grep "amazon_affiliate_tag" _config.yml
# Rebuild site
bundle exec jekyll clean
bundle exec jekyll build
# Clear browser cache (Ctrl+Shift+R on most browsers)
GA4 Events Not Tracking
Symptom: No affiliate_click events in GA4 Real-time report
Possible Causes:
- google_analytics not set in _config.yml
- User rejected cookie consent
- Ad blocker blocking GA4 script
- JavaScript errors in console
Debug:
// Open browser DevTools (F12) → Console
// Check for errors like:
// "gtag is not defined"
// "Failed to load resource: net::ERR_BLOCKED_BY_CLIENT"
// Test gtag manually:
gtag('event', 'test_event', {'test_param': 'test_value'});
Solutions:
- Set google_analytics: "G-XXXXXXXXXX" in _config.yml
- Accept cookies when prompted
- Disable ad blocker for testing
- Check assets/js/affiliate-tracking.js loaded correctly
GitHub Actions Workflow Failing
Symptom: Red X on workflow run, email notification of failure
Possible Causes:
- Python dependencies not installed
- YAML syntax error in product database
- Network timeout to Amazon
- Git push authentication failure
Debug:
# View workflow logs on GitHub
# Click Actions → workflow run → job → step
# Test locally
cd scripts
python3 check_amazon_products.py
echo $? # Check exit code
Common Errors:
Error: ModuleNotFoundError: No module named 'yaml'
# Fix: Add to workflow
- name: Install dependencies
run: pip install PyYAML requests beautifulsoup4
Error: yaml.scanner.ScannerError: mapping values are not allowed here
# Fix: Check _data/amazon-products.yml for syntax errors
# Common causes: inconsistent indentation, or an unquoted value containing ": " (colon plus space)
amazon_url: "https://amazon.es/dp/B08XXXXX" # Good: quoted
amazon_url: https://amazon.es/dp/B08X-XXX # Unquoted: parses (a dash is harmless), but quoting is safer
Error: fatal: could not read Username for 'https://github.com'
# Fix: Workflow uses GITHUB_TOKEN automatically, but check:
- name: Commit changes
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
Availability Checker Not Updating Products
Symptom: Products remain active despite being unavailable on Amazon
Possible Causes:
- Workflow not running (check schedule)
- Product URL returns 200 OK but product is actually unavailable
- failed_checks counter not incrementing
Debug:
# Run checker locally
cd scripts
python3 check_amazon_products.py
# Check output
cat ../_data/amazon-products.yml | grep -A 3 "failed_checks"
Note: Amazon sometimes returns 200 OK for unavailable products (the page shows “Currently unavailable”). The checker only reacts to HTTP status codes such as 404, so these products stay active. Consider enhancing the checker to parse the HTML content if needed.
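One possible shape for such an enhancement is a text check on the downloaded page. The sketch below uses only the standard library; looks_unavailable and UNAVAILABLE_MARKERS are hypothetical names, and the Spanish marker "no disponible" is an assumption about how amazon.es words the message, so verify it against a real page before relying on it.

```python
from html.parser import HTMLParser
from typing import List

# Assumption: phrases that indicate an unavailable product (lowercase).
UNAVAILABLE_MARKERS = ("currently unavailable", "no disponible")


class _TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def looks_unavailable(html: str) -> bool:
    """Return True if the page text contains one of the unavailability markers."""
    parser = _TextExtractor()
    parser.feed(html)
    # Collapse whitespace so markers split across lines still match.
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return any(marker in text for marker in UNAVAILABLE_MARKERS)


print(looks_unavailable('<div id="availability"><span>Currently unavailable.</span></div>'))
print(looks_unavailable('<div id="availability"><span>En stock</span></div>'))
```

This is a coarse heuristic: a marker appearing anywhere on the page counts, so restricting the search to the availability element would reduce false positives. A match could then be treated like a failed check and increment failed_checks.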
CSS Styling Issues
Symptom: Affiliate block looks broken or unstyled
Possible Causes:
- assets/css/style.css not loaded
- CSS rules overridden by other stylesheets
- Browser not applying responsive styles
Debug:
# Check if CSS file exists
ls -lh assets/css/style.css
# View compiled CSS in browser DevTools
# F12 → Elements → Styles → see which rules apply
Solutions:
/* Add !important to force styles (use sparingly) */
.affiliate-block {
margin: 3rem 0 !important;
}
/* Check media query syntax */
@media (max-width: 768px) {
.product-grid {
grid-template-columns: 1fr;
}
}
Support and Maintenance
Regular Maintenance Tasks
| Task | Frequency | Action |
|---|---|---|
| Review GA4 reports | Monthly | Check top products, optimize priorities |
| Update product database | Quarterly | Add new products, remove discontinued |
| Check Amazon Associates account | Monthly | Verify earnings, ensure compliance |
| Review availability checker logs | Weekly | Check for failed products, fix URLs |
| Update prices in YAML | As needed | Prices shown are informational only |
Getting Help
- Amazon Associates Support: https://afiliados.amazon.es/help
- Google Analytics Help: https://support.google.com/analytics
- Jekyll Documentation: https://jekyllrb.com/docs/
- GitHub Actions Docs: https://docs.github.com/actions
20. System Statistics
Current Database Size
- Total products: 101
- Wheels & Components: 6
- Nutrition (Energy Gels): 7
- Nutrition (Recovery): 6
- Nutrition (Hydration): 3
- Electronics: 15
- Clothing: 41
- Books: 23
Check Performance
- Daily rotation: 7-day cycle
- Products checked per day: 3-41 (varies by category)
- Average check time: 1-14 minutes per day
- Full check time: 17-34 minutes (all 101 products)
- Delay between requests: 10-20 seconds
- Price accuracy: Updated within 7 days
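The timing figures above can be cross-checked from the category sizes under Current Database Size and the 10-20 second delay between requests. The sketch below counts request delays only and ignores response time and workflow setup, so it gives a lower-bound estimate rather than a measurement; check_minutes is a hypothetical helper.

```python
from typing import Tuple

# Category sizes as listed under "Current Database Size".
CATEGORY_SIZES = {
    "Wheels & Components": 6,
    "Nutrition (Energy Gels)": 7,
    "Nutrition (Recovery)": 6,
    "Nutrition (Hydration)": 3,
    "Electronics": 15,
    "Clothing": 41,
    "Books": 23,
}
MIN_DELAY_S, MAX_DELAY_S = 10, 20  # delay between requests, in seconds


def check_minutes(product_count: int) -> Tuple[float, float]:
    """Return the (fastest, slowest) check duration in minutes for a product count."""
    return (product_count * MIN_DELAY_S / 60, product_count * MAX_DELAY_S / 60)


total_products = sum(CATEGORY_SIZES.values())
full_low, full_high = check_minutes(total_products)
print(f"Total products: {total_products}")
print(f"Full check: {full_low:.1f}-{full_high:.1f} minutes")
for name, size in CATEGORY_SIZES.items():
    low, high = check_minutes(size)
    print(f"{name}: {size} products, {low:.1f}-{high:.1f} minutes")
```

The smallest category (3 products) and the largest (41 products) bracket the daily range, and the sum over all seven categories gives the full-check range.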
GitHub Actions Usage
- Workflow runs per month: ~30 (daily schedule)
- Minutes per run: 1-14 (varies by category)
- Total minutes per month: ~200-300
- GitHub free tier limit: 2,000 minutes/month ✅
Maintenance Schedule
| Task | Frequency | Last Completed |
|---|---|---|
| Product database expansion | Quarterly | November 21, 2025 |
| Price accuracy audit | Monthly | November 21, 2025 |
| CSS/styling updates | As needed | November 21, 2025 |
| Documentation update | Major changes | November 21, 2025 |
| Legal compliance review | Annually | November 20, 2025 |
Last Updated: November 21, 2025
Version: 2.0.0
Database Size: 101 products
Daily Rotation: Enabled (7-day cycle)
Price Tracking: Active (BeautifulSoup)
Maintained By: Ciclismo Hoy Development Team