Amazon Associates Affiliate System - Setup Guide

Amazon Associates Affiliate System - Setup Guide

Complete documentation for the intelligent Amazon affiliate integration in the Ciclismo Hoy cycling diary website.

Last Updated: November 21, 2025
Product Database Size: 101 products
Daily Rotation: Enabled (7-day cycle)

Table of Contents

  1. Prerequisites
  2. Configuration
  3. Product Database Overview
  4. Product Categories
  5. Race Terrain Tags
  6. Season Tags
  7. Book Tags
  8. Priority Scoring System
  9. Seasonal Filtering Logic
  10. Daily Rotation Algorithm
  11. Price Tracking System
  12. Daily Category Rotation
  13. Adding New Products
  14. URL Validation
  15. Local Testing
  16. GitHub Actions Automation
  17. Google Analytics 4 Integration
  18. Legal Compliance
  19. Troubleshooting

1. Prerequisites

Amazon Associates Account

  1. Sign up at Amazon Afiliados España
  2. Complete the application process:
    • Provide website URL: https://yourdomain.com
    • Describe your content: “Cycling race coverage and analysis”
    • Specify your niche: “Sports & Outdoors”
  3. Wait for approval (usually 1-3 business days)
  4. Get your affiliate tag (format: yourname-21)

Google Analytics 4 Account

  1. Create a GA4 property at Google Analytics
  2. Get your Measurement ID (format: G-XXXXXXXXXX)
  3. Enable enhanced measurement events
  4. Create custom events for affiliate tracking (optional, auto-created)

Development Environment

# Ruby and Jekyll
ruby --version  # >= 2.7.0
bundle --version

# Python for automation
python3 --version  # >= 3.8
pip3 install PyYAML requests beautifulsoup4

2. Configuration

Edit _config.yml

Replace the placeholder values:

# Amazon Affiliate Settings
amazon_affiliate_tag: "yourname-21"  # ← REPLACE with your Amazon Associates tag

# UTM Parameters for tracking
amazon_utm_source: "ciclismohoy"
amazon_utm_medium: "affiliate"

# Google Analytics
google_analytics: "G-XXXXXXXXXX"  # ← REPLACE with your GA4 Measurement ID

Verification

# Check configuration is valid YAML
bundle exec jekyll build --trace

If there are YAML syntax errors, the build will fail with details.


3. Product Database Overview

Current Database Statistics

  • Total Products: 101
  • Wheels: 6 (Shimano, Fulcrum)
  • Nutrition: 16 (gels, recovery drinks, isotonic)
  • Electronics: 15 (GPS computers, lights, radar)
  • Clothing: 41 (jerseys, shorts, jackets, accessories)
  • Books: 23 (Tour de France, grand tours, training, biographies)

Database Location

Products are stored in _data/amazon-products.yml

Field Structure

Field Type Required Description Example
id String Unique product identifier wheels-shimano-rs100-rear
name_es String Spanish product name Shimano RS100 Rueda Trasera
amazon_link String Amazon.es URL with tag placeholder https://amazon.es/dp/B07DDB6WM9?tag=ciclismohoy-21
categories Array Product categories [wheels, road, components]
race_terrain_tags Array Compatible race terrains [all] or [flat, hills-flat]
season_tags Array Suitable seasons [all-season] or [winter, fall]
book_tags Array Race keywords (books only) [tour-de-france, tour, tdf]
priority Integer Base scoring priority (1-10) 7
price_tier String Price category budget, mid, premium
price_eur Float Current price in euros 45.99
description_es String Short Spanish description (max 100 chars) Rueda trasera de carretera fiable
last_checked String ISO timestamp of last check 2025-11-21T09:35:39.532202Z
failed_checks Integer Consecutive failed checks (0-3) 0
isActive Boolean Product availability status true

Example Product Entry

- id: wheels-shimano-rs100-rear
  name_es: "Shimano RS100 Rueda Trasera Carretera"
  amazon_link: "https://www.amazon.es/Rueda-Tras-RS100-Pinza-QR-17C/dp/B07DDB6WM9?tag=ciclismohoy-21"
  categories: [wheels, road, components]
  race_terrain_tags: [all]
  season_tags: [all-season]
  priority: 5
  price_tier: budget
  price_eur: 45.99
  description_es: "Rueda trasera de carretera fiable para entrenamiento"
  last_checked: '2025-11-21T09:35:39.532202Z'
  failed_checks: 0
  isActive: true

4. Product Categories

Available Categories

Category Count Description Example Products
wheels 6 Wheelsets and components Shimano RS100, Fulcrum Racing Zero
nutrition 16 Sports nutrition products Energy gels, recovery drinks, isotonic
energy 7 Energy gels and bars Victory Endurance, 226ERS, HSN
recovery 6 Post-workout recovery Recovery drinks, protein powders
hydration 3 Isotonic drinks 226ERS, Powerbar, Crown Sport
electronics 15 GPS and safety devices Garmin, Wahoo, Bryton, lights, radar
gps 12 GPS cycling computers Edge 530/840, ELEMNT BOLT/ROAM
lights 3 Bike lights and safety Sigma Buster, iGPSPORT, Garmin Varia
clothing 41 Cycling apparel Jerseys, shorts, jackets, gloves
jersey 20 Cycling jerseys TDF official, Santic, Spiuk, Gore Wear
shorts 8 Cycling shorts/bibs Danish Endurance, Gore Wear, Spiuk
winter 12 Cold weather gear Thermal tights, jackets, gloves
books 23 Cycling literature Tour history, biographies, training

Category Usage

  • Primary category: First in the array (used for badge display)
  • Multiple categories: Products can belong to several categories for flexible filtering
  • Rotation grouping: Categories determine which day of week products are checked

5. Race Terrain Tags

The system detects race terrain from ProcyclingStats.com using CSS class selectors.

Terrain Detection Mapping

ProcyclingStats Class Terrain Tag Description Suitable Products
.icon.profile.p1 flat Flat stages Aero bikes, deep wheels, TT equipment
.icon.profile.p2 hills-flat Hilly finish, flat overall All-rounder bikes, mid-depth wheels
.icon.profile.p3 hills-uphill Hilly with uphill finish Climbing bikes, lightweight wheels
.icon.profile.p4 mountains-flat Mountains with flat finish All-rounder bikes, versatile gear
.icon.profile.p5 mountains-uphill Mountain top finish Lightweight bikes, climbing wheels

How Terrain is Captured

In scripts/scrape_and_generate.py, the detect_race_terrain() function:

def detect_race_terrain(soup):
    """
    Detect race terrain type from ProcyclingStats parcours profile icon.
    
    Returns:
        str: One of 'flat', 'hills-flat', 'hills-uphill', 'mountains-flat', 
             'mountains-uphill', or None
    """
    profile_icon = soup.select_one('.icon.profile.p1, .icon.profile.p2, '
                                   '.icon.profile.p3, .icon.profile.p4, '
                                   '.icon.profile.p5')
    
    if profile_icon:
        if 'p1' in profile_icon.get('class', []):
            return 'flat'
        elif 'p2' in profile_icon.get('class', []):
            return 'hills-flat'
        elif 'p3' in profile_icon.get('class', []):
            return 'hills-uphill'
        elif 'p4' in profile_icon.get('class', []):
            return 'mountains-flat'
        elif 'p5' in profile_icon.get('class', []):
            return 'mountains-uphill'
    
    return None

This terrain tag is then saved in the frontmatter of each race markdown file:

---
title: "Stage 9 - Tour de France 2025"
race_terrain: "mountains-uphill"  # ← Automatically detected
---

Product Configuration

When adding products, specify which terrains they’re suitable for:

# Example: Climbing bike
terrain_tags:
  - "hills-uphill"
  - "mountains-flat"
  - "mountains-uphill"

# Example: Aero bike
terrain_tags:
  - "flat"
  - "hills-flat"

Terrain Bonus: Products matching the page’s race_terrain receive +5 points in scoring.


6. Season Tags

Products can be tagged for specific seasons based on typical weather conditions in cycling.

Season Mapping

Season Months Typical Races Product Examples
winter Nov, Dec, Jan, Feb Early season training races Thermal jackets, long bibs, gloves
spring Mar, Apr, May Paris-Roubaix, Amstel Gold Spring classics gear, windbreakers
summer Jun, Jul, Aug Tour de France, Giro, Vuelta Lightweight jerseys, sunglasses
fall Sep, Oct Vuelta a España, Il Lombardia Transitional clothing, arm warmers

Season Detection

In _includes/amazon-affiliate-block.html, the current season is determined from the page date:


{% assign month = page.date | date: "%m" | plus: 0 %}
{% if month == 11 or month == 12 or month == 1 or month == 2 %}
  {% assign current_season = "winter" %}
{% elsif month >= 3 and month <= 5 %}
  {% assign current_season = "spring" %}
{% elsif month >= 6 and month <= 8 %}
  {% assign current_season = "summer" %}
{% else %}
  {% assign current_season = "fall" %}
{% endif %}

Product Configuration

# Example: Winter jacket
season_tags:
  - "winter"
  - "fall"  # Can work in late fall too

# Example: Summer jersey (year-round in warm climates)
season_tags:
  - "spring"
  - "summer"
  - "fall"

Season Bonus: Products matching the current season receive +3 points in scoring.

Important: Leave season_tags empty for year-round products (e.g., energy gels, basic maintenance tools).


7. Book Tags

Books about famous races receive special treatment with the book_tags field.

Supported Book Tags

Tag Race Example Titles
tour Tour de France “The Official Tour de France Records”
giro Giro d’Italia “Giro d’Italia: The Story of the World’s Most Beautiful Race”
vuelta Vuelta a España “Vuelta: Inside Spain’s Greatest Race”
roubaix Paris-Roubaix “Paris-Roubaix: A Journey Through Hell”
monuments Classic Monuments “The Monuments: The Grit and the Glory”
biography Rider biographies “Lance Armstrong: Every Second Counts” (if applicable)

Book Detection Logic

In _includes/amazon-affiliate-block.html:


{% assign is_famous_race = false %}
{% assign page_title_lower = page.title | downcase %}

{% if page_title_lower contains "tour" or 
      page_title_lower contains "giro" or 
      page_title_lower contains "vuelta" or 
      page_title_lower contains "roubaix" %}
  {% assign is_famous_race = true %}
{% endif %}

Product Configuration

# Example: Tour de France book
- id: "book-tour-de-france-official"
  name: "Tour de France: The Official History"
  category: "books"
  book_tags:
    - "tour"
  priority: 10  # High priority for books

Book Bonus:

  • Books matching race keywords receive +10 points
  • Books are always shown when is_famous_race = true (guaranteed slot in top 3)

8. Priority Scoring System

Each product receives a dynamic score based on multiple factors.

Scoring Formula

TOTAL_SCORE = base_priority + terrain_bonus + season_bonus + book_bonus

Where:
- base_priority    = product.priority (1-10)
- terrain_bonus    = +5 if terrain matches, 0 otherwise
- season_bonus     = +3 if season matches, 0 otherwise  
- book_bonus       = +10 if book tag matches race, 0 otherwise

Scoring Examples

Example 1: Climbing bike on mountain stage in summer

Product:
  priority: 8
  terrain_tags: ["mountains-uphill"]
  season_tags: ["spring", "summer", "fall"]

Page:
  race_terrain: "mountains-uphill"
  date: "2025-07-15" (summer)

Calculation:
  8 (base) + 5 (terrain match) + 3 (season match) + 0 (not a book) = 16 points

Example 2: Tour de France book on Tour stage

Product:
  priority: 10
  book_tags: ["tour"]
  
Page:
  title: "Stage 9 - Tour de France 2025"
  
Calculation:
  10 (base) + 0 (no terrain) + 0 (no season) + 10 (book match) = 20 points

Example 3: Energy gel (year-round, all terrains)

Product:
  priority: 4
  terrain_tags: ["flat", "hills-flat", "hills-uphill", "mountains-flat", "mountains-uphill"]
  season_tags: []  # Empty = always matches

Page:
  race_terrain: "flat"
  date: "2025-09-21" (fall)

Calculation:
  4 (base) + 5 (terrain match) + 3 (season match, empty = always match) + 0 = 12 points

Priority Guidelines

Priority Range Use Case Examples
1-3 Low priority accessories Basic tools, generic items
4-6 Medium priority consumables Nutrition, standard clothing
7-8 High priority equipment Quality bikes, wheelsets
9-10 Premium/specialized items Top-end bikes, rare books

9. Seasonal Filtering Logic

Products are always active unless explicitly marked isActive: false. The season tags are used for scoring, not filtering.

How It Works

  1. All active products (isActive: true) are considered
  2. Products with matching season_tags get +3 bonus points
  3. Products with empty season_tags are year-round and always get the season bonus
  4. Top-scored products are selected (after terrain and book bonuses)

Example Configuration

# Winter-specific jacket
- id: "clothing-castelli-perfetto"
  season_tags: ["winter", "fall"]
  isActive: true
  # This gets +3 in winter/fall, 0 in spring/summer

# Year-round nutrition
- id: "nutrition-sis-gels"
  season_tags: []  # Empty = always gets +3
  isActive: true
  # This ALWAYS gets +3 regardless of season

Disabling Products

To temporarily hide a product:

- id: "bike-out-of-stock"
  isActive: false  # ← Product won't appear in any recommendations

This is automatically managed by the availability checker (see GitHub Actions).


10. Daily Rotation Algorithm

To provide variety while maintaining consistency, products rotate daily based on the current date.

How It Works

  1. All matching products are scored
  2. Products are sorted by score (highest first)
  3. A daily seed is calculated: day_of_year % product_count
  4. Products are rotated using the seed
  5. Top 3 products are displayed

Implementation

In _includes/amazon-affiliate-block.html:


{% comment %} Calculate day of year for rotation {% endcomment %}
{% assign day_of_year = page.date | date: "%j" | plus: 0 %}

{% comment %} After sorting by score, rotate products {% endcomment %}
{% assign rotation_offset = day_of_year | modulo: sorted_products.size %}

{% comment %} Select top 3 after rotation {% endcomment %}
{% for i in (0..2) %}
  {% assign index = i | plus: rotation_offset | modulo: sorted_products.size %}
  {% assign product = sorted_products[index] %}
  {% comment %} Display product {% endcomment %}
{% endfor %}

Example Rotation

Given 10 matching products on a stage page:

  • Day 1 (day_of_year = 264): offset = 264 % 10 = 4 → show products [4, 5, 6]
  • Day 2 (day_of_year = 265): offset = 265 % 10 = 5 → show products [5, 6, 7]
  • Day 3 (day_of_year = 266): offset = 266 % 10 = 6 → show products [6, 7, 8]

This ensures:

  • ✅ Same products shown all day (consistency)
  • ✅ Different products tomorrow (variety)
  • ✅ All products eventually shown (fairness)

11. Price Tracking System

Products can optionally display current prices fetched from Amazon.es.

Price Field

Each product in _data/amazon-products.yml can have a price_eur field:

- id: "electronics-garmin-edge-840"
  name_es: "Garmin Edge 840"
  categories:
    - "electronics"
  price_eur: 329.97  # Float value, no currency symbol
  last_checked: "2025-11-21T09:35:39Z"

How Prices Are Fetched

The scripts/check_amazon_products.py script:

  1. Daily rotation: Checks specific product categories each weekday (see Section 12)
  2. HTTP GET request: Fetches full HTML page from Amazon
  3. BeautifulSoup parsing: Extracts price from multiple CSS selectors:
    • .a-price-whole (primary price element)
    • .priceToPay .a-price-whole (alternative location)
    • .a-offscreen (screen reader price)
  4. Price cleaning: Removes thousands separator (.), converts comma to decimal (,.)
  5. Float conversion: Stores as float without currency symbol
  6. Timestamp update: Records ISO 8601 timestamp in last_checked

Price Display in Templates

In _includes/amazon-affiliate-block.html:


{% if product.price_eur %}
  <span class="product-price">{{ product.price_eur }}€</span>
{% endif %}

Conditional display: Prices only shown if price_eur exists and is not null.

Price Badge Styling

Prices are styled as badges matching the category badges:

.product-price {
  display: inline-block;
  background: #e3f2fd;  /* Light blue background */
  color: #0056b3;       /* Dark blue text */
  margin: 0 !important; /* No margin, inline with category */
  margin-bottom: 0.8rem !important; /* Bottom spacing */
  padding: 0.3rem 0.8rem;
  border-radius: 12px;
  font-size: 0.75rem;
  font-weight: 700;
  text-transform: uppercase;
}

Visual appearance:

  • Blue badge next to category badge
  • Uppercase text for consistency
  • Same size and padding as category badges
  • Displayed above product description

Price Accuracy

Important notes:

  • Prices are informational only, not guaranteed to be current
  • Amazon prices change frequently throughout the day
  • Price displayed reflects last check time (see last_checked timestamp)
  • Users always see current price on Amazon when they click through
  • Daily rotation means each product category is checked only once per week

Price Scraping Challenges

Why not use Amazon API?

  • Amazon Product Advertising API requires approval
  • API has strict rate limits and query restrictions
  • Associates program doesn’t grant automatic API access

Scraping limitations:

  • Amazon’s HTML structure changes occasionally (requires maintenance)
  • Rate limiting: 10-20 second delays between requests to avoid blocks
  • Some products return 200 OK but have no price (“Currently unavailable”)
  • Prices may differ by region, Prime membership, or promotions

How to handle failures:

  • If price extraction fails, price_eur remains null (no price displayed)
  • Product still shows without price badge
  • Next scheduled check will retry

Manual Price Updates

You can manually set prices in _data/amazon-products.yml:

- id: "wheels-mavic-cosmic-sl-45"
  price_eur: 899.99
  last_checked: "2025-11-21T10:00:00Z"  # Manual timestamp

Use cases:

  • Quick testing before scraper runs
  • Override automated prices if scraper gets wrong value
  • Set prices for new products immediately

12. Daily Category Rotation

To avoid Amazon rate limiting and reduce server load, the price checker uses a 7-day category rotation system.

Why Category Rotation?

Problem: Checking all 101 products daily would:

  • Take 17-34 minutes (10-20s delay × 101 products)
  • Risk Amazon blocking IP address for excessive requests
  • Consume unnecessary GitHub Actions minutes

Solution: Check different product categories each weekday:

  • Each category checked once per week
  • Total check time: 1-7 minutes per day (depending on category size)
  • Prices stay reasonably fresh (7-day maximum staleness)

Category Schedule

Weekday Category Products Checked Estimated Time
Monday Wheels & Components ~6 products 1-2 minutes
Tuesday Energy Gels ~7 products 1-2 minutes
Wednesday Recovery Products ~6 products 1-2 minutes
Thursday Hydration ~3 products 0.5-1 minute
Friday Electronics ~15 products 2.5-5 minutes
Saturday Clothing ~41 products 7-14 minutes
Sunday Books ~23 products 4-8 minutes

Implementation

In scripts/check_amazon_products.py:

CATEGORY_ROTATION = {
    0: ['wheels', 'components'],        # Monday
    1: ['nutrition-energy'],            # Tuesday
    2: ['nutrition-recovery'],          # Wednesday
    3: ['nutrition-hydration'],         # Thursday
    4: ['electronics'],                 # Friday
    5: ['clothing'],                    # Saturday
    6: ['books'],                       # Sunday
}

def get_categories_to_check(check_all=False):
    if check_all:
        return None  # Check all categories
    
    weekday = datetime.now().weekday()
    return CATEGORY_ROTATION.get(weekday, [])

How It Works

  1. Daily GitHub Actions run at 3 AM UTC
  2. Script checks current weekday (0=Monday, 6=Sunday)
  3. Filters products by category:
    def product_matches_categories(product, categories):
        if categories is None:
            return True  # Check all
           
        product_categories = product.get('categories', [])
        if isinstance(product_categories, str):
            product_categories = [product_categories]
           
        return any(cat in product_categories for cat in categories)
    
  4. Checks only matching products
  5. Updates price_eur and last_checked fields
  6. Commits changes if prices updated

Manual Full Check

You can override rotation and check all products:

Command-line:

cd scripts
python3 check_amazon_products.py --all

GitHub Actions:

  1. Go to Actions tab
  2. Select “Amazon Product Availability Checker”
  3. Click “Run workflow”
  4. Check “Check all products (ignore rotation)” option
  5. Click “Run workflow”

This runs a full check of all 101 products (takes 17-34 minutes).

Rate Limiting Protection

Delays between requests:

import random
import time

# Initial delay before first request
time.sleep(random.uniform(3, 5))

# Between each product
time.sleep(random.uniform(10, 20))

User-Agent rotation:

USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64)...',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...',
    'Mozilla/5.0 (X11; Linux x86_64)...'
]

headers = {'User-Agent': random.choice(USER_AGENTS)}

Why 10-20 seconds?

  • Amazon typically allows 1-2 requests per second for light scraping
  • 10-20s delay = 3-6 requests per minute (well under typical limits)
  • Appears more like human browsing than bot behavior
  • Reduces chance of CAPTCHA or IP blocking

Monitoring Rotation

Check which categories were updated:

# Show last_checked timestamps by category
grep -A 5 'categories:' _data/amazon-products.yml | grep -A 1 'last_checked'

View GitHub Actions logs:

  1. Go to Actions tab
  2. Click on latest workflow run
  3. See output:
    Categories to check today (Monday): ['wheels', 'components']
    Checking 6 products...
    Updated 5 products with new prices
    

13. Adding New Products

Step-by-Step Process

1. Find Product on Amazon.es

  1. Go to Amazon.es
  2. Search for cycling product: “specialized tarmac sl7”
  3. Open the product page
  4. Get the standard product URL: https://amazon.es/dp/B08XXXXX

2. Extract Product Information

  • Product name: From page title
  • Price: Current price (for reference, not stored)
  • Image URL: Right-click image → “Copy image address”
  • Description: Write a concise Spanish description

3. Determine Tags

Terrain tags:

  • Is this product better for flat, hilly, or mountain stages?
  • Example: Aero bike → ["flat", "hills-flat"]
  • Example: Climbing bike → ["mountains-uphill"]

Season tags:

  • When is this product most useful?
  • Example: Winter jacket → ["winter", "fall"]
  • Example: Energy gel → [] (year-round, leave empty)

Book tags (only for books):

  • Does this book cover a famous race?
  • Example: Tour de France book → ["tour"]

4. Set Priority

  • 1-3: Generic accessories
  • 4-6: Standard consumables and clothing
  • 7-8: Quality bikes and equipment
  • 9-10: Premium items and specialized books

5. Add to _data/amazon-products.yml

- id: "wheels-mavic-cosmic-sl-45"  # Unique kebab-case ID
  name_es: "Mavic Cosmic SL 45"     # Spanish product name
  categories:                        # Array of categories
    - "wheels"
  amazon_link: "https://amazon.es/dp/B09XXXXX"  # Standard Amazon URL
  description_es: "Ruedas aerodinámicas de carbono, ideales para etapas llanas y contrarrelojes."
  terrain_tags:
    - "flat"
    - "hills-flat"
  season_tags:
    - "spring"
    - "summer"
    - "fall"
  priority: 8
  price_eur: null                   # Will be filled by scraper
  isActive: true
  last_checked: null                # ISO 8601 timestamp
  failed_checks: 0

Field reference:

  • id: Unique identifier (kebab-case)
  • name_es: Spanish product name
  • categories: Array of categories (can have multiple)
  • amazon_link: Standard Amazon.es URL format
  • description_es: Spanish description (shown in product card)
  • terrain_tags: Terrain types (see Section 6)
  • season_tags: Seasons (see Section 9), empty = year-round
  • priority: Base priority 1-10 (see Section 8)
  • price_eur: Current price as float (filled by scraper)
  • isActive: Boolean, set to false by scraper if unavailable
  • last_checked: ISO timestamp of last availability check
  • failed_checks: Counter incremented on HTTP errors

6. Validate YAML Syntax

# Check for YAML errors
python3 -c "import yaml; yaml.safe_load(open('_data/amazon-products.yml'))"

# Or build Jekyll site
bundle exec jekyll build --trace

7. Test Locally

bundle exec jekyll serve
# Visit http://localhost:4000
# Navigate to a race page matching your product's tags
# Verify the product appears in recommendations

14. URL Validation

All Amazon URLs must follow the standard format for the availability checker to work.

Valid URL Format

https://amazon.es/dp/{ASIN}

Where {ASIN} is the 10-character Amazon Standard Identification Number.

Examples

Valid:

https://amazon.es/dp/B08L87NP4M
https://www.amazon.es/dp/B08L87NP4M

Invalid:

https://amazon.es/Specialized-Tarmac-SL7-Carbon/dp/B08L87NP4M/ref=xxx
https://amzn.to/3xYzAbc
https://amazon.com/dp/B08L87NP4M  (wrong domain)

Regex Pattern

The validation script uses this pattern:

AMAZON_URL_PATTERN = re.compile(
    r'^https?://(?:www\.)?amazon\.es/(?:.*/)dp/([A-Z0-9]{10})(?:/.*)?$',
    re.IGNORECASE
)

How to Get Standard URL

  1. Go to product page on Amazon.es
  2. Look at URL bar
  3. Extract the ASIN (10-char code after /dp/)
  4. Construct: https://amazon.es/dp/{ASIN}

Example:

Full URL: 
https://www.amazon.es/Specialized-Tarmac-SL7-Comp-Carbon-Road-Bike/dp/B08L87NP4M/ref=sr_1_1?keywords=specialized+tarmac

Extracted ASIN: 
B08L87NP4M

Standard URL:
https://amazon.es/dp/B08L87NP4M

15. Local Testing

Build and Serve

# Install dependencies
bundle install

# Serve locally with live reload
bundle exec jekyll serve --livereload

# Serve on specific port
bundle exec jekyll serve --port 4001

# Serve accessible from network
bundle exec jekyll serve --host 0.0.0.0

Test Scenarios

1. Test Affiliate Block Display

Navigate to a race page:

http://localhost:4000/stages/2025/09-september/21/2025-85thskoda-tour-de-luxembourg-2-pro/

Check:

  • ✅ Affiliate block appears at bottom of page
  • ✅ 3 products are displayed
  • ✅ Products have images, names, descriptions
  • ✅ “Ver en Amazon” buttons are present
  • ✅ Amazon Associates disclosure text appears

2. Test Terrain Matching

Create a test race page with specific terrain:

---
layout: default
title: "Test Mountain Stage"
race_terrain: "mountains-uphill"
date: 2025-07-15
---

Test content here.

Expected: Products with terrain_tags: ["mountains-uphill"] should score higher and appear first.

3. Test Season Filtering

Change your system date or modify the page date field:

date: 2025-01-15  # Winter

Expected: Products with season_tags: ["winter"] should get +3 bonus points.

4. Test Book Recommendations

Navigate to a Tour de France stage:

http://localhost:4000/stages/2025/07-july/15/tour-de-france-stage-9/

Expected: Books with book_tags: ["tour"] should appear with high priority.

5. Test Daily Rotation

Build the site twice on different days:

# Day 1
bundle exec jekyll build
cat _site/stages/.../index.html | grep "product-card" -A 5

# Day 2 (change system date or page date)
bundle exec jekyll build
cat _site/stages/.../index.html | grep "product-card" -A 5

Expected: Different products in the top 3 on different days.

6. Test Analytics Tracking

  1. Open browser DevTools (F12)
  2. Go to Console tab
  3. Navigate to a race page
  4. Click “Ver en Amazon” button
  5. Check console for:
Tracking affiliate click: {
  product_id: "bike-specialized-tarmac",
  product_name: "Specialized Tarmac SL7",
  race_terrain: "flat",
  click_url: "https://amazon.es/dp/B08XXXXX?tag=yourname-21&...",
  source_page: "/stages/2025/09-september/21/...",
  value: 1,
  currency: "EUR"
}

Debugging Tips

Products not appearing?

Check:

  1. Is isActive: true in product YAML?
  2. Does page have race_terrain in frontmatter?
  3. Are terrain_tags compatible with race_terrain?
  4. Is current season matching season_tags?

Affiliate tag not in URLs?

Check:

  1. _config.yml has amazon_affiliate_tag: "yourname-21"
  2. Jekyll site was rebuilt after config change
  3. Browser cache cleared (Ctrl+Shift+R)

CSS not loading?

Check:

  1. assets/css/style.css exists
  2. _layouts/default.html has <link> tag to stylesheet
  3. Jekyll build completed without errors

16. GitHub Actions Automation

The availability and price checker runs daily via GitHub Actions to keep prices fresh and ensure product URLs remain valid.

Workflow File

.github/workflows/check-amazon-products.yml

Schedule

schedule:
  - cron: '0 3 * * *'  # Every day at 3:00 AM UTC

Timezone: UTC (Coordinated Universal Time)

Frequency: Daily (365 times per year)

Category rotation: Each day checks specific categories (see Section 12)

Manual Trigger

You can also run the workflow manually:

  1. Go to GitHub repository
  2. Click Actions tab
  3. Select “Amazon Product Availability Checker” workflow
  4. Click “Run workflow” button
  5. Select branch (usually main)
  6. Optional: Check “Check all products (ignore rotation)” to run full check
  7. Click “Run workflow”

Default behavior: Follows daily rotation schedule

With “Check all” enabled: Checks all 101 products (takes 17-34 minutes)

What It Does

  1. Checks out repository code
  2. Sets up Python 3.11 environment
  3. Installs dependencies: PyYAML, requests, beautifulsoup4
  4. Determines category to check based on weekday (or all if manual with check_all)
  5. Runs availability and price checker script
  6. Fetches prices from Amazon using BeautifulSoup HTML parsing
  7. Updates price_eur and last_checked fields in YAML
  8. Commits changes if products were updated
  9. Creates GitHub Issue if validation errors occurred

Exit Codes

Exit Code Meaning Action Taken
0 No changes needed No commit, workflow ends
1 Products updated Commit changes to _data/amazon-products.yml
2 Validation errors found Create GitHub Issue with details

Monitoring Workflow

View Workflow Runs

  1. Go to Actions tab in GitHub
  2. See list of past runs with status icons:
    • ✅ Green check = success
    • ❌ Red X = failure
    • 🟡 Yellow dot = in progress

Read Workflow Logs

  1. Click on a workflow run
  2. Click on job name (e.g., “check-products”)
  3. Expand steps to see detailed logs

Check for Issues

If validation errors occur, a GitHub Issue is automatically created:

  1. Go to Issues tab
  2. Look for issue titled: “Amazon Product Availability Check Failed”
  3. Read error details in issue description
  4. Fix products in _data/amazon-products.yml
  5. Push changes to trigger another check

Availability and Price Logic

The checker script (scripts/check_amazon_products.py) performs:

  1. URL Validation:
    • Ensures all URLs match https://amazon.es/dp/{ASIN} pattern
    • Creates validation error if format is wrong
  2. Category Filtering:
    • Determines categories to check based on weekday
    • Filters products matching today’s categories
    • Uses all products if --all flag set
  3. HTTP GET Request:
    • Sends full GET request to fetch HTML page
    • Uses rotating User-Agent headers
    • Includes 10-20 second random delay between requests
    • Initial 3-5 second delay before first request
  4. Status Code Analysis:
    • 200 OK: Product available → reset failed_checks to 0
    • 404 Not Found: Product not found → increment failed_checks
    • Other errors: Network/server issue → increment failed_checks
  5. Price Extraction (BeautifulSoup):
    • Parses HTML with BeautifulSoup4
    • Tries multiple CSS selectors: .a-price-whole, .priceToPay .a-price-whole, .a-offscreen
    • Cleans price string: removes thousands separator, converts comma to decimal
    • Converts to float and stores in price_eur
    • Updates last_checked with ISO 8601 timestamp
  6. Deactivation Threshold:
    • If failed_checks >= 3 → set isActive: false
    • Product hidden from all recommendations
  7. Reactivation:
    • If previously failed product returns 200 OK → set isActive: true, reset failed_checks to 0

Customizing Check Frequency

Current schedule: Daily at 3 AM UTC

Edit .github/workflows/check-amazon-products.yml:

# Current: Daily at 3 AM UTC
schedule:
  - cron: '0 3 * * *'

# Alternative: Twice daily (3 AM and 3 PM UTC)
schedule:
  - cron: '0 3,15 * * *'

# Alternative: Every 6 hours
schedule:
  - cron: '0 */6 * * *'

# Alternative: Weekly only (original schedule)
schedule:
  - cron: '0 3 * * 1'  # Mondays only

Note: With daily rotation, running more than once per day won’t provide fresher prices (each category only checked once per day regardless of workflow frequency).

Use crontab.guru to generate cron expressions.


17. Google Analytics 4 Integration

Track affiliate link performance with Google Analytics 4.

Setup

1. Get Measurement ID

  1. Go to Google Analytics
  2. Create or select a GA4 property
  3. Go to AdminData Streams
  4. Select your web stream
  5. Copy Measurement ID (format: G-XXXXXXXXXX)

2. Add to Configuration

Edit _config.yml:

google_analytics: "G-XXXXXXXXXX"  # ← Your Measurement ID

3. Rebuild Site

bundle exec jekyll build
git add _config.yml
git commit -m "Add GA4 measurement ID"
git push

Custom Events

The system tracks affiliate clicks with custom events:

gtag('event', 'affiliate_click', {
  'product_id': 'bike-specialized-tarmac',
  'product_name': 'Specialized Tarmac SL7',
  'race_terrain': 'flat',
  'click_url': 'https://amazon.es/dp/B08XXXXX?tag=...',
  'source_page': '/stages/2025/09-september/21/...',
  'value': 1,
  'currency': 'EUR'
});

Viewing Analytics

Real-Time Reports

  1. Go to ReportsReal-time
  2. Open a race page on your site
  3. Click an affiliate link
  4. See event appear in Real-time report (within 30 seconds)

Event Reports

  1. Go to ReportsEngagementEvents
  2. Find affiliate_click event
  3. Click to see details:
    • Event count
    • Users who triggered event
    • Event parameters (product_id, race_terrain, etc.)

Creating Custom Dashboard

  1. Go to ExploreBlank
  2. Add dimensions:
    • Event name
    • product_id (custom parameter)
    • race_terrain (custom parameter)
  3. Add metrics:
    • Event count
    • Total users
  4. Create visualizations:
    • Table: Product performance by ID
    • Bar chart: Clicks by race terrain
    • Time series: Daily affiliate clicks

Conversion Tracking

To track actual Amazon purchases (requires Amazon Associates):

  1. Get conversion data from Amazon Associates dashboard
  2. Manually import to GA4, or
  3. Use Google Tag Manager with Amazon OneLink integration

Note: Amazon doesn’t provide pixel tracking, so conversion data must be matched manually via UTM parameters.

UTM Parameters

All affiliate links include UTM parameters for tracking:

https://amazon.es/dp/B08XXXXX?tag=yourname-21&utm_source=ciclismohoy&utm_medium=affiliate&utm_campaign=race-page&utm_content=product-id
  • utm_source: ciclismohoy (your site name)
  • utm_medium: affiliate (traffic type)
  • utm_campaign: race-page or homepage (page type)
  • utm_content: Product ID (e.g., bike-specialized-tarmac)

Privacy Compliance

The GA4 implementation respects privacy:

  1. Cookie consent required: GA4 only loads if user accepts cookies (see _includes/cookie-consent.html)
  2. IP anonymization: Enabled by default in GA4
  3. Data retention: Set to 14 months (configurable in GA4 settings)
  4. User deletion: Can be requested via privacy policy contact

Amazon Associates Requirements

Operating Agreement

You must comply with the Amazon Associates Operating Agreement:

  1. Clear disclosure: All affiliate links must be clearly disclosed
  2. Accurate content: Product information must be truthful
  3. No prohibited content: No illegal, hateful, or violent content
  4. Link freshness: Regularly update product links

Disclosure Text

Our implementation includes disclosure in:

  1. Affiliate block footer:
    <p class="affiliate-disclaimer">
      Como Afiliados de Amazon, ganamos por compras calificadas.
    </p>
    
  2. Dedicated disclosure page: /amazon-disclosure/

  3. Footer link: Every page has link to disclosure page

GDPR Compliance

The site uses _includes/cookie-consent.html to:

  1. Inform users about cookies (Google Analytics, Amazon tracking)
  2. Request consent before loading GA4
  3. Provide opt-out option

Privacy Policy

Update /privacy-policy.md to include:

  • Data collected: Affiliate clicks, product views
  • How data is used: Analytics, improving recommendations
  • Third parties: Amazon, Google Analytics
  • User rights: Access, deletion, opt-out

Update /cookie-policy.md to include:

  • First-party cookies: Session, preferences
  • Third-party cookies: Google Analytics, Amazon Associates
  • How to disable: Browser settings, consent banner

FTC Guidelines (US)

If targeting US audience, follow FTC Endorsement Guidelines:

  1. Clear and conspicuous disclosure (already implemented)
  2. Proximity to affiliate link (disclosure in same block as products)
  3. Plain language (Spanish: “ganamos por compras calificadas”)

Country-Specific Requirements

Spain (Where Site Operates)

  • RGPD/GDPR compliance: Cookie consent banner
  • LSSI (Ley de Servicios de la Sociedad de la Información): Disclosure page
  • Consumer protection: Accurate product information

EU General

  • Consumer Rights Directive: Clear pricing, no hidden fees
  • ePrivacy Directive: Cookie consent before tracking

19. Troubleshooting

Common Issues

Prices Not Displaying

Symptom: Products show but no price badges appear

Possible Causes:

  1. price_eur field is null (product not checked yet)
  2. Price scraper failed to extract price
  3. Product’s category not checked this week
  4. CSS not applied to .product-price class

Solutions:

# Check if product has price_eur value
grep -A 10 'id: "electronics-garmin' _data/amazon-products.yml

# Run checker manually for specific category
cd scripts
python3 check_amazon_products.py  # Uses daily rotation
python3 check_amazon_products.py --all  # Checks all products

# Check Jekyll build includes prices
grep -A 5 'product-price' _site/index.html

# Verify CSS loaded
curl -s http://localhost:4000/assets/css/style.css | grep 'product-price'

Expected behavior:

  • Products without price_eur show no price badge (normal)
  • Products with price_eur: null show no price badge
  • Products with price_eur: 329.97 show “329.97€” in blue badge
  • Each category checked once per week via daily rotation

Products Not Appearing

Symptom: Affiliate block is empty or shows “No products available”

Possible Causes:

  1. All products have isActive: false
  2. No products match current terrain/season
  3. YAML syntax error in product database

Solutions:

# Check YAML syntax
python3 -c "import yaml; print(yaml.safe_load(open('_data/amazon-products.yml')))"

# Check if any products are active
grep "isActive: true" _data/amazon-products.yml

# Rebuild Jekyll site
bundle exec jekyll clean
bundle exec jekyll build --trace

Wrong Products Displayed

Symptom: Products don’t match the race terrain

Possible Causes:

  1. race_terrain not set in frontmatter
  2. Product terrain_tags misconfigured
  3. Scoring algorithm not working

Debug:


{% comment %} Add to _includes/amazon-affiliate-block.html {% endcomment %}
<p>Debug: race_terrain = {{ page.race_terrain }}</p>
<p>Debug: current_season = {{ current_season }}</p>

Affiliate Tag Missing from URLs

Symptom: Amazon URLs don’t include ?tag=yourname-21

Possible Causes:

  1. amazon_affiliate_tag not set in _config.yml
  2. Jekyll site not rebuilt after config change
  3. Browser caching old HTML

Solutions:

# Verify config
grep "amazon_affiliate_tag" _config.yml

# Rebuild site
bundle exec jekyll clean
bundle exec jekyll build

# Clear browser cache (Ctrl+Shift+R on most browsers)

GA4 Events Not Tracking

Symptom: No affiliate_click events in GA4 Real-time report

Possible Causes:

  1. google_analytics not set in _config.yml
  2. User rejected cookie consent
  3. Ad blocker blocking GA4 script
  4. JavaScript errors in console

Debug:

// Open browser DevTools (F12) → Console
// Check for errors like:
// "gtag is not defined"
// "Failed to load resource: net::ERR_BLOCKED_BY_CLIENT"

// Test gtag manually:
gtag('event', 'test_event', {'test_param': 'test_value'});

Solutions:

  1. Set google_analytics: "G-XXXXXXXXXX" in _config.yml
  2. Accept cookies when prompted
  3. Disable ad blocker for testing
  4. Check assets/js/affiliate-tracking.js loaded correctly

GitHub Actions Workflow Failing

Symptom: Red X on workflow run, email notification of failure

Possible Causes:

  1. Python dependencies not installed
  2. YAML syntax error in product database
  3. Network timeout to Amazon
  4. Git push authentication failure

Debug:

# View workflow logs on GitHub
# Click Actions → workflow run → job → step

# Test locally
cd scripts
python3 check_amazon_products.py
echo $?  # Check exit code

Common Errors:

Error: ModuleNotFoundError: No module named 'yaml'

# Fix: Add to workflow
- name: Install dependencies
  run: pip install PyYAML requests

Error: yaml.scanner.ScannerError: mapping values are not allowed here

# Fix: Check _data/amazon-products.yml for syntax errors
# Common mistake: missing quotes around URLs with special chars
amazon_url: "https://amazon.es/dp/B08XXXXX"  # Good
amazon_url: https://amazon.es/dp/B08X-XXX    # Bad (dash after B08X)

Error: fatal: could not read Username for 'https://github.com'

# Fix: Workflow uses GITHUB_TOKEN automatically, but check:
- name: Commit changes
  env:
    GITHUB_TOKEN: $

Availability Checker Not Updating Products

Symptom: Products remain active despite being unavailable on Amazon

Possible Causes:

  1. Workflow not running (check schedule)
  2. Product URL returns 200 OK but product is actually unavailable
  3. failed_checks counter not incrementing

Debug:

# Run checker locally
cd scripts
python3 check_amazon_products.py

# Check output
cat ../_data/amazon-products.yml | grep -A 3 "failed_checks"

Note: Amazon sometimes returns 200 OK for unavailable products (shows “Currently unavailable” on page). The checker only detects 404 errors. Consider enhancing checker to parse HTML content if needed.

CSS Styling Issues

Symptom: Affiliate block looks broken or unstyled

Possible Causes:

  1. assets/css/style.css not loaded
  2. CSS rules overridden by other stylesheets
  3. Browser not applying responsive styles

Debug:

# Check if CSS file exists
ls -lh assets/css/style.css

# View compiled CSS in browser DevTools
# F12 → Elements → Styles → see which rules apply

Solutions:

/* Add !important to force styles (use sparingly) */
.affiliate-block {
  margin: 3rem 0 !important;
}

/* Check media query syntax */
@media (max-width: 768px) {
  .product-grid {
    grid-template-columns: 1fr;
  }
}

Support and Maintenance

Regular Maintenance Tasks

Task Frequency Action
Review GA4 reports Monthly Check top products, optimize priorities
Update product database Quarterly Add new products, remove discontinued
Check Amazon Associates account Monthly Verify earnings, ensure compliance
Review availability checker logs Weekly Check for failed products, fix URLs
Update prices in YAML As needed Prices shown are informational only

Getting Help


20. System Statistics

Current Database Size

  • Total products: 101
    • Wheels & Components: 6
    • Nutrition (Energy Gels): 7
    • Nutrition (Recovery): 6
    • Nutrition (Hydration): 3
    • Electronics: 15
    • Clothing: 41
    • Books: 23

Check Performance

  • Daily rotation: 7-day cycle
  • Products checked per day: 3-41 (varies by category)
  • Average check time: 1-14 minutes per day
  • Full check time: 17-34 minutes (all 101 products)
  • Delay between requests: 10-20 seconds
  • Price accuracy: Updated within 7 days

GitHub Actions Usage

  • Workflow runs per month: ~30 (daily schedule)
  • Minutes per run: 1-14 (varies by category)
  • Total minutes per month: ~200-300
  • GitHub free tier limit: 2,000 minutes/month ✅

Maintenance Schedule

Task Frequency Last Completed
Product database expansion Quarterly November 21, 2025
Price accuracy audit Monthly November 21, 2025
CSS/styling updates As needed November 21, 2025
Documentation update Major changes November 21, 2025
Legal compliance review Annually November 20, 2025

Last Updated: November 21, 2025
Version: 2.0.0
Database Size: 101 products
Daily Rotation: Enabled (7-day cycle)
Price Tracking: Active (BeautifulSoup)
Maintained By: Ciclismo Hoy Development Team

Productos Recomendados

Como Afiliados de Amazon, ganamos por compras calificadas. Precios orientativos, sujetos a cambios.
books

La Historia Oficial del Tour de Francia (Edición Española)

Historia oficial del Tour de Francia en español

Ver en Amazon
books

Paris-Roubaix: Un Viaje por el Infierno del Norte

La mítica carrera de los adoquines del norte de Francia

Ver en Amazon
books

Official History of the Tour de France (English)

Historia oficial del Tour de Francia en inglés (edición revisada)

Ver en Amazon