docs4you committed
Commit b66af9f · verified · 1 Parent(s): 0102861

Upload 8 files

.dockerignore ADDED
@@ -0,0 +1,18 @@
+ # .dockerignore
+ __pycache__/
+ *.pyc
+ *.pyo
+ *.pyd
+ .git/
+ .gitignore
+ README.md
+ .env
+ .venv/
+ venv/
+ .pytest_cache/
+ .coverage
+ htmlcov/
+ .tox/
+ dist/
+ build/
+ *.egg-info/
Dockerfile ADDED
@@ -0,0 +1,23 @@
+ # Dockerfile
+ FROM python:3.11-slim
+
+ # Set working directory
+ WORKDIR /app
+
+ # Install system dependencies
+ RUN apt-get update && apt-get install -y \
+     curl \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy requirements and install Python dependencies
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Create directories
+ RUN mkdir -p /app/mappings /app/playlists
+
+ # Expose port
+ EXPOSE 6680
+
+ # Command to run the application
+ CMD ["python", "main.py"]
README.md CHANGED
@@ -1,10 +1,78 @@
- ---
- title: Tetete
- emoji: 📉
- colorFrom: purple
- colorTo: green
- sdk: docker
- pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # README.md
+ # Complete FSTV Proxy Server
+
+ Docker-based IPTV proxy server with automated scraping and playlist generation.
+
+ ## 🚀 Features
+
+ - **Encoded URLs**: Clean proxy URLs such as `/match/a7k9mq3x.m3u8` (see the mapping sketch below)
+ - **Auto-scraping**: Runs daily at 12:05 AM UTC
+ - **Download endpoints**: Direct playlist/EPG downloads (known limitation: the EPG only reflects events that were live at scrape time and is not yet reliable)
+ - **Sports + TV**: Combined matches and channels
+ - **Real-time streams**: Live HLS extraction from FSTV
+
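Each encoded ID is backed by a record in the JSON mapping files that the scraper writes to `mappings/`. A rough sketch of what one match entry might look like (field names come from `src/scraper.py`; the ID, path, and values here are made up):

```python
# Illustrative only — the real files are generated by src/scraper.py.
example_match_mapping = {
    "a7k9mq3x": {  # encoded ID used in /match/a7k9mq3x.m3u8
        "fstv_path": "/match/team-a-vs-team-b-football-123",  # hypothetical FSTV path
        "type": "match",
        "name": "Team A vs Team B",
        "league": "Football",
        "status": "Upcoming",
        "timestamp": "2025-01-01T19:00:00+00:00",
        "created_at": "2025-01-01T00:05:00+00:00",
    }
}
```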
+ ## 📁 Setup
+
+ 1. **Create structure:**
+ ```bash
+ mkdir fstv-proxy && cd fstv-proxy
+ mkdir src mappings playlists
+ ```
+
+ 2. **Copy files:**
+ - Copy Docker files to root
+ - Copy Python files to `src/`
+ - Copy FSTV data files to root
+
+ 3. **Build and run:**
+ ```bash
+ docker-compose build
+ docker-compose up -d
+ ```
+
+ ## 🌐 Server Endpoints
+
+ **Base URL**: `http://your-server:6680`
+
+ ### Download Endpoints:
+ - `GET /playlist/matches.m3u8` - Sports matches playlist
+ - `GET /playlist/channels.m3u8` - TV channels playlist
+ - `GET /playlist/combined.m3u8` - Combined playlist
+ - `GET /epg/matches.xml` - Sports EPG
+
+ ### Streaming Endpoints:
+ - `GET /match/{id}.m3u8` - Match stream
+ - `GET /channel/{id}.m3u8` - TV channel stream
+
+ ### Control Endpoints:
+ - `POST /scrape-now` - Manual scrape
+ - `GET /scrape-status` - Scrape info
+ - `GET /health` - Health check
+ - `GET /stats` - Server stats
+
+ ## ⏰ Auto-Scraping
+
+ - **Schedule**: 12:05 AM UTC daily (more run times can be added in `main.py` — see the sketch below)
+ - **Covers**: Full-day events (00:10 - 23:45)
+ - **Updates**: Mappings + playlists automatically
+
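To scrape more than once per day, extra cron strings can be appended to `SCRAPE_TIMES` in `src/main.py`. A minimal sketch (only the minute and hour fields are used by the scheduler; the second entry is purely an example):

```python
# src/main.py — each entry is "minute hour day month day_of_week" (UTC).
SCRAPE_TIMES = [
    "05 00 * * *",  # existing run at 12:05 AM UTC
    "05 12 * * *",  # hypothetical extra run at 12:05 PM UTC
]
```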
+ ## 🔧 Development
+
+ - Edit `src/main.py` or `src/scraper.py`
+ - Run `docker-compose restart`
+ - No rebuild needed — `./src` is mounted into the container
+
+ ## 📊 Usage Example
+
+ ```bash
+ # Get playlists
+ curl http://your-server:6680/playlist/combined.m3u8
+ # (/playlist/matches.m3u8  - sports matches playlist)
+ # (/playlist/channels.m3u8 - TV channels playlist)
+
+ # Manual scrape
+ curl -X POST http://your-server:6680/scrape-now
+
+ # Check status
+ curl http://your-server:6680/health
+ ```
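The same calls can also be made from Python with `httpx` (already in `requirements.txt`); a small sketch, assuming the server is reachable at the placeholder host:

```python
import httpx

BASE = "http://your-server:6680"

# Trigger a manual scrape (may take a while), then fetch the combined playlist.
scrape = httpx.post(f"{BASE}/scrape-now", timeout=300)
print(scrape.json().get("status"))

playlist = httpx.get(f"{BASE}/playlist/combined.m3u8", timeout=30)
print(playlist.text.splitlines()[0])  # first line should be #EXTM3U
```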
docker-compose.yml ADDED
@@ -0,0 +1,17 @@
+ # docker-compose.yml
+ version: '3.8'
+
+ services:
+   fstv-proxy:
+     build: .
+     network_mode: "host"
+     volumes:
+       - ./src:/app
+       - ./mappings:/app/mappings
+       - ./playlists:/app/playlists
+     environment:
+       - PYTHONUNBUFFERED=1
+       - LOG_LEVEL=INFO
+       - TZ=UTC
+     restart: unless-stopped
+     container_name: fstv_proxy_server
requirements.txt ADDED
@@ -0,0 +1,8 @@
+ # requirements.txt
+ fastapi==0.104.1
+ uvicorn[standard]==0.24.0
+ httpx==0.25.2
+ aiofiles==23.2.0
+ python-multipart==0.0.6
+ apscheduler==3.10.4
+ pytz==2023.3
src/__pycache__/main.cpython-311.pyc ADDED
Binary file (30.7 kB).
 
src/main.py ADDED
@@ -0,0 +1,690 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ FSTV Proxy Server with Integrated Scraper
4
+ - Handles encoded URLs for matches and TV channels
5
+ - Auto-scrapes FSTV at 12:05 AM daily
6
+ - Provides download endpoints for playlists and EPG
7
+ - Implements the v3→v4 variant transformation
8
+ - Routes all streams through fast-fstv.duckdns.org:4123 proxy
9
+ - Server: http://fast-fstv.duckdns.org:6680
10
+ """
11
+
12
+ import json
13
+ import os
14
+ import re
15
+ import asyncio
16
+ import subprocess
17
+ import base64
18
+ from datetime import datetime, timezone
19
+ from urllib.parse import urljoin, urlparse, quote
20
+ from typing import Optional, Dict, Any
21
+ from contextlib import asynccontextmanager
22
+
23
+ import httpx
24
+ import uvicorn
25
+ from fastapi import FastAPI, HTTPException, Request
26
+ from fastapi.responses import Response, PlainTextResponse, FileResponse
27
+ from apscheduler.schedulers.asyncio import AsyncIOScheduler
28
+ import pytz
29
+
30
+ # Global variables
31
+ url_mappings: Dict[str, Dict[str, Any]] = {}
32
+ http_client: Optional[httpx.AsyncClient] = None
33
+ scheduler: Optional[AsyncIOScheduler] = None
34
+ last_scrape_info = {"status": "not_run", "timestamp": None, "mappings_count": 0}
35
+
36
+ # Configuration
37
+ FSTV_BASE_URL = "https://fstv.space"
38
+ USER_AGENT = "Mozilla/5.0 (X11; U; Linux x86_64; pl-PL; rv:2.0) Gecko/20110307 Firefox/4.0"
39
+ REQUEST_TIMEOUT = 15
40
+ SERVER_BASE_URL = "http://your-server:6680"
41
+ PROXY_SERVER = "http://m3u-playlist-server:4123"
42
+
43
+ # Scraping schedule - 12:05 AM daily
44
+ SCRAPE_TIMES = [
45
+ "05 00 * * *", # 12:05 AM - Right after midnight schedule refresh, add more if needed
46
+ ]
47
+
48
+ @asynccontextmanager
49
+ async def lifespan(app: FastAPI):
50
+ """Handle startup and shutdown events"""
51
+ global http_client, url_mappings, scheduler
52
+
53
+ # STARTUP
54
+ print("?? Starting FSTV Proxy Server...")
55
+
56
+ # Initialize HTTP client
57
+ http_client = httpx.AsyncClient(
58
+ headers={
59
+ "User-Agent": USER_AGENT,
60
+ "Referer": "https://fstv.space/",
61
+ "Origin": "https://fstv.space",
62
+ "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
63
+ "Accept-Language": "en-US,en;q=0.5",
64
+ "Cache-Control": "no-cache"
65
+ },
66
+ timeout=REQUEST_TIMEOUT,
67
+ follow_redirects=True
68
+ )
69
+
70
+ # Load URL mappings
71
+ await load_url_mappings()
72
+
73
+ # Run initial scrape if no mappings found
74
+ if len(url_mappings) == 0:
75
+ print("?? No existing mappings found - running initial scrape...")
76
+ await run_automatic_scraper()
77
+ print(f"? Initial scrape complete - loaded {len(url_mappings)} mappings")
78
+
79
+ # Initialize scheduler
80
+ scheduler = AsyncIOScheduler(timezone=pytz.UTC)
81
+
82
+ # Add scraping jobs
83
+ for scrape_time in SCRAPE_TIMES:
84
+ minute, hour, day, month, day_of_week = scrape_time.split()
85
+ scheduler.add_job(
86
+ run_automatic_scraper,
87
+ 'cron',
88
+ hour=int(hour),
89
+ minute=int(minute),
90
+ id=f"scraper_{hour}_{minute}"
91
+ )
92
+ print(f"? Scheduled scraper for {hour}:{minute:0>2} UTC daily")
93
+
94
+ scheduler.start()
95
+
96
+ print(f"? Server initialized")
97
+ print(f" ?? {len(url_mappings)} URL mappings loaded")
98
+ print(f" ?? Auto-scraper scheduled for 12:05 AM UTC daily")
99
+ print(f" ?? Server URL: {SERVER_BASE_URL}")
100
+
101
+ yield
102
+
103
+ # SHUTDOWN
104
+ if scheduler:
105
+ scheduler.shutdown()
106
+
107
+ if http_client:
108
+ await http_client.aclose()
109
+
110
+ print("?? FSTV Proxy Server shut down")
111
+
112
+ app = FastAPI(
113
+ title="FSTV Proxy Server",
114
+ description="Complete IPTV proxy server for FSTV sports and TV channels",
115
+ version="2.0.0",
116
+ lifespan=lifespan
117
+ )
118
+
119
+ async def load_url_mappings():
120
+ """Load URL mappings from JSON files"""
121
+ global url_mappings
122
+
123
+ url_mappings = {}
124
+ mapping_files = [
125
+ "/app/mappings/url_mappings_matches.json",
126
+ "/app/mappings/url_mappings_channels.json"
127
+ ]
128
+
129
+ for file_path in mapping_files:
130
+ if os.path.exists(file_path):
131
+ try:
132
+ with open(file_path, 'r', encoding='utf-8') as f:
133
+ file_mappings = json.load(f)
134
+ url_mappings.update(file_mappings)
135
+ print(f"? Loaded {len(file_mappings)} mappings from {os.path.basename(file_path)}")
136
+ except Exception as e:
137
+ print(f"? Failed to load {file_path}: {e}")
138
+ else:
139
+ print(f"?? Mapping file not found: {file_path}")
140
+
141
+ async def run_automatic_scraper():
142
+ """Run the scraper automatically"""
143
+ global last_scrape_info
144
+
145
+ print("?? Running automatic scraper...")
146
+
147
+ try:
148
+ # Run the scraper
149
+ result = subprocess.run(
150
+ ["python", "/app/scraper.py"],
151
+ capture_output=True,
152
+ text=True,
153
+ cwd="/app"
154
+ )
155
+
156
+ if result.returncode == 0:
157
+ print("? Automatic scraper completed successfully")
158
+ await load_url_mappings()
159
+
160
+ last_scrape_info = {
161
+ "status": "success",
162
+ "timestamp": datetime.now(timezone.utc).isoformat(),
163
+ "mappings_count": len(url_mappings),
164
+ "output": result.stdout[-500:] if result.stdout else "" # Last 500 chars
165
+ }
166
+ else:
167
+ print(f"? Scraper failed with return code {result.returncode}")
168
+ print(f"Error: {result.stderr}")
169
+
170
+ last_scrape_info = {
171
+ "status": "error",
172
+ "timestamp": datetime.now(timezone.utc).isoformat(),
173
+ "error": result.stderr,
174
+ "return_code": result.returncode
175
+ }
176
+
177
+ except Exception as e:
178
+ print(f"? Error running automatic scraper: {e}")
179
+ last_scrape_info = {
180
+ "status": "exception",
181
+ "timestamp": datetime.now(timezone.utc).isoformat(),
182
+ "error": str(e)
183
+ }
184
+
185
+ # ============================================================================
186
+ # STREAMING HELPER FUNCTIONS
187
+ # ============================================================================
188
+
189
+ def encode_headers_for_proxy(headers: dict) -> str:
190
+ """Encode headers as base64 for proxy server"""
191
+ try:
192
+ # Convert headers to pipe-separated format
193
+ header_parts = []
194
+ for key, value in headers.items():
195
+ header_parts.append(f"{key}={value}")
196
+
197
+ header_string = "|".join(header_parts)
198
+ # Base64 encode
199
+ encoded = base64.b64encode(header_string.encode('utf-8')).decode('utf-8')
200
+ return encoded
201
+ except Exception as e:
202
+ print(f"? Error encoding headers: {e}")
203
+ return ""
204
+
205
+ async def fetch_v3_to_v4_url(v3_url: str) -> Optional[str]:
206
+ """Fetch v3 URL and extract v4 variant URL"""
207
+ try:
208
+ print(f" ?? Fetching v3 master: {v3_url}")
209
+
210
+ response = await http_client.get(v3_url)
211
+ if response.status_code != 200:
212
+ print(f" ? HTTP {response.status_code} - Failed to fetch v3 master")
213
+ return None
214
+
215
+ content = response.text
216
+
217
+ # Extract v4 path from master playlist
218
+ v4_path = extract_v4_path_from_playlist(content)
219
+ if not v4_path:
220
+ print(f" ? No v4 path found in master playlist")
221
+ return None
222
+
223
+ # Build complete v4 URL
224
+ parsed_v3 = urlparse(v3_url)
225
+ base_url = f"{parsed_v3.scheme}://{parsed_v3.netloc}"
226
+ v4_url = base_url + v4_path
227
+
228
+ print(f" ? Extracted v4 URL: {v4_url}")
229
+ return v4_url
230
+
231
+ except Exception as e:
232
+ print(f" ? Error fetching v3 URL: {e}")
233
+ return None
234
+
235
+ def extract_v4_path_from_playlist(playlist_content: str) -> Optional[str]:
236
+ """Extract v4 path from master playlist"""
237
+ try:
238
+ lines = playlist_content.split('\n')
239
+ best_bandwidth = 0
240
+ best_path = None
241
+
242
+ for i, line in enumerate(lines):
243
+ line = line.strip()
244
+
245
+ if line.startswith('#EXT-X-STREAM-INF'):
246
+ # Extract bandwidth
247
+ bandwidth_match = re.search(r'BANDWIDTH=(\d+)', line)
248
+ if bandwidth_match:
249
+ bandwidth = int(bandwidth_match.group(1))
250
+
251
+ # Get the next line which should be the path
252
+ if i + 1 < len(lines):
253
+ path_line = lines[i + 1].strip()
254
+ if path_line and not path_line.startswith('#'):
255
+ # Look for v4 paths specifically
256
+ if '/v4/' in path_line:
257
+ if bandwidth > best_bandwidth:
258
+ best_bandwidth = bandwidth
259
+ best_path = path_line
260
+
261
+ if best_path:
262
+ print(f" ?? Best v4 path: {best_bandwidth} bps -> {best_path}")
263
+ return best_path
264
+ else:
265
+ print(f" ?? No v4 paths found in playlist")
266
+ return None
267
+
268
+ except Exception as e:
269
+ print(f" ? Error extracting v4 path: {e}")
270
+ return None
271
+
272
+ async def proxy_stream_through_4123(stream_url: str, headers: dict) -> Optional[str]:
273
+ try:
274
+ encoded_url = quote(stream_url, safe='')
275
+ encoded_headers = encode_headers_for_proxy(headers)
276
+ proxy_url = f"{PROXY_SERVER}?url={encoded_url}&data={encoded_headers}"
277
+
278
+ # Browser-like headers for 4123 server
279
+ browser_headers = {
280
+ "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
281
+ "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
282
+ "Accept-Language": "en-US,en;q=0.5",
283
+ "Accept-Encoding": "gzip, deflate",
284
+ "DNT": "1",
285
+ "Connection": "keep-alive",
286
+ "Upgrade-Insecure-Requests": "1"
287
+ }
288
+
289
+ response = await http_client.get(proxy_url, headers=browser_headers)
290
+
291
+ if response.status_code == 200:
292
+ print(f" ? Proxy returned HLS content")
293
+ return response.text
294
+ else:
295
+ print(f" ? Proxy failed: HTTP {response.status_code}")
296
+ print(f" ?? Response: {response.text[:200]}")
297
+ return None
298
+
299
+ except Exception as e:
300
+ print(f" ? Error proxying stream: {e}")
301
+ return None
302
+
303
+ async def extract_streaming_urls_from_html(html_content: str) -> list:
304
+ """Extract streaming URLs from match page HTML"""
305
+ streaming_urls = []
306
+
307
+ # Pattern 1: Look for direct .m3u8 URLs
308
+ m3u8_pattern = r'https://[^"\'\s<>]+\.m3u8[^"\'\s<>]*'
309
+ m3u8_matches = re.findall(m3u8_pattern, html_content)
310
+
311
+ for url in m3u8_matches:
312
+ url = url.strip()
313
+ if is_valid_streaming_url(url):
314
+ streaming_urls.append(url)
315
+
316
+ # Pattern 2: Look for JavaScript variables
317
+ js_url_pattern = r'["\']https://[^"\'<>]+\.m3u8[^"\'<>]*["\']'
318
+ js_matches = re.findall(js_url_pattern, html_content)
319
+
320
+ for match in js_matches:
321
+ url = match.strip('"\'')
322
+ if is_valid_streaming_url(url) and url not in streaming_urls:
323
+ streaming_urls.append(url)
324
+
325
+ return streaming_urls
326
+
327
+ def is_valid_streaming_url(url: str) -> bool:
328
+ """Check if URL looks like a valid streaming URL"""
329
+ try:
330
+ parsed = urlparse(url)
331
+ if not parsed.scheme or not parsed.netloc:
332
+ return False
333
+
334
+ if not url.endswith('.m3u8'):
335
+ return False
336
+
337
+ bad_patterns = ['javascript:', 'data:', 'blob:', 'about:']
338
+ for pattern in bad_patterns:
339
+ if url.lower().startswith(pattern):
340
+ return False
341
+
342
+ return True
343
+ except:
344
+ return False
345
+
346
+ # ============================================================================
347
+ # API ENDPOINTS
348
+ # ============================================================================
349
+
350
+ @app.get("/")
351
+ async def root():
352
+ """Root endpoint with server info"""
353
+ return {
354
+ "service": "FSTV Proxy Server",
355
+ "version": "2.0.0",
356
+ "status": "running",
357
+ "server_url": SERVER_BASE_URL,
358
+ "mappings_loaded": len(url_mappings),
359
+ "last_scrape": last_scrape_info,
360
+ "endpoints": {
361
+ "streaming": [
362
+ "/match/{encoded_id}.m3u8",
363
+ "/channel/{encoded_id}.m3u8"
364
+ ],
365
+ "downloads": [
366
+ "/playlist/matches.m3u8",
367
+ "/playlist/channels.m3u8",
368
+ "/playlist/combined.m3u8",
369
+ "/epg/matches.xml"
370
+ ],
371
+ "control": [
372
+ "/scrape-now",
373
+ "/scrape-status",
374
+ "/health",
375
+ "/stats"
376
+ ]
377
+ }
378
+ }
379
+
380
+ @app.get("/health")
381
+ async def health_check():
382
+ """Health check endpoint"""
383
+ return {
384
+ "status": "healthy",
385
+ "timestamp": datetime.utcnow().isoformat(),
386
+ "mappings_count": len(url_mappings),
387
+ "last_scrape_status": last_scrape_info.get("status", "unknown")
388
+ }
389
+
390
+ @app.get("/stats")
391
+ async def get_stats():
392
+ """Get detailed server statistics"""
393
+ match_count = sum(1 for mapping in url_mappings.values() if mapping.get('type') == 'match')
394
+ channel_count = sum(1 for mapping in url_mappings.values() if mapping.get('type') == 'channel')
395
+
396
+ return {
397
+ "total_mappings": len(url_mappings),
398
+ "matches": match_count,
399
+ "channels": channel_count,
400
+ "server_time": datetime.utcnow().isoformat(),
401
+ "last_scrape": last_scrape_info,
402
+ "scheduled_scrapes": [
403
+ {"time": "00:05 UTC", "description": "Daily automatic scrape"}
404
+ ]
405
+ }
406
+
407
+ # ============================================================================
408
+ # DOWNLOAD ENDPOINTS
409
+ # ============================================================================
410
+
411
+ @app.get("/playlist/matches.m3u8")
412
+ async def download_matches_playlist():
413
+ """Download matches M3U playlist"""
414
+ file_path = "/app/playlists/fstv_matches_encoded.m3u"
415
+
416
+ if not os.path.exists(file_path):
417
+ raise HTTPException(status_code=404, detail="Matches playlist not found. Run scraper first.")
418
+
419
+ return FileResponse(
420
+ path=file_path,
421
+ media_type="application/vnd.apple.mpegurl",
422
+ filename="fstv_matches.m3u"
423
+ )
424
+
425
+ @app.get("/playlist/channels.m3u8")
426
+ async def download_channels_playlist():
427
+ """Download TV channels M3U playlist"""
428
+ file_path = "/app/playlists/fstv_tv_channels_encoded.m3u"
429
+
430
+ if not os.path.exists(file_path):
431
+ raise HTTPException(status_code=404, detail="Channels playlist not found. Run scraper first.")
432
+
433
+ return FileResponse(
434
+ path=file_path,
435
+ media_type="application/vnd.apple.mpegurl",
436
+ filename="fstv_channels.m3u"
437
+ )
438
+
439
+ @app.get("/playlist/combined.m3u8")
440
+ async def download_combined_playlist():
441
+ """Download combined matches + channels playlist"""
442
+ try:
443
+ combined_content = "#EXTM3U url-tvg=\"http://fast-fstv.duckdns.org:6680/epg/matches.xml\"\n"
444
+
445
+ # Add matches
446
+ matches_file = "/app/playlists/fstv_matches_encoded.m3u"
447
+ if os.path.exists(matches_file):
448
+ with open(matches_file, 'r', encoding='utf-8') as f:
449
+ content = f.read()
450
+ # Skip the #EXTM3U line and add the rest
451
+ lines = content.split('\n')[1:]
452
+ combined_content += '\n'.join(lines) + '\n'
453
+
454
+ # Add channels
455
+ channels_file = "/app/playlists/fstv_tv_channels_encoded.m3u"
456
+ if os.path.exists(channels_file):
457
+ with open(channels_file, 'r', encoding='utf-8') as f:
458
+ content = f.read()
459
+ # Skip the #EXTM3U line and add the rest
460
+ lines = content.split('\n')[1:]
461
+ combined_content += '\n'.join(lines) + '\n'
462
+
463
+ return PlainTextResponse(
464
+ content=combined_content,
465
+ media_type="application/vnd.apple.mpegurl",
466
+ headers={"Content-Disposition": "attachment; filename=fstv_combined.m3u"}
467
+ )
468
+
469
+ except Exception as e:
470
+ raise HTTPException(status_code=500, detail=f"Failed to generate combined playlist: {str(e)}")
471
+
472
+ @app.get("/epg/matches.xml")
473
+ async def download_matches_epg():
474
+ """Download matches EPG/XMLTV file"""
475
+ file_path = "/app/playlists/fstv_matches_encoded.xml"
476
+
477
+ if not os.path.exists(file_path):
478
+ raise HTTPException(status_code=404, detail="EPG file not found. Run scraper first.")
479
+
480
+ return FileResponse(
481
+ path=file_path,
482
+ media_type="application/xml",
483
+ filename="fstv_epg.xml"
484
+ )
485
+
486
+ # ============================================================================
487
+ # SCRAPER CONTROL ENDPOINTS
488
+ # ============================================================================
489
+
490
+ @app.post("/scrape-now")
491
+ async def manual_scrape():
492
+ """Manually trigger scraper"""
493
+ print("?? Manual scrape triggered...")
494
+
495
+ try:
496
+ result = subprocess.run(
497
+ ["python", "/app/scraper.py"],
498
+ capture_output=True,
499
+ text=True,
500
+ cwd="/app"
501
+ )
502
+
503
+ if result.returncode == 0:
504
+ await load_url_mappings()
505
+
506
+ return {
507
+ "status": "success",
508
+ "message": f"Scraper completed. Loaded {len(url_mappings)} mappings.",
509
+ "timestamp": datetime.utcnow().isoformat(),
510
+ "output": result.stdout[-1000:] if result.stdout else ""
511
+ }
512
+ else:
513
+ return {
514
+ "status": "error",
515
+ "message": "Scraper failed",
516
+ "error": result.stderr,
517
+ "return_code": result.returncode,
518
+ "timestamp": datetime.utcnow().isoformat()
519
+ }
520
+
521
+ except Exception as e:
522
+ return {
523
+ "status": "exception",
524
+ "message": str(e),
525
+ "timestamp": datetime.utcnow().isoformat()
526
+ }
527
+
528
+ @app.get("/scrape-status")
529
+ async def get_scrape_status():
530
+ """Get last scrape status and info"""
531
+ return last_scrape_info
532
+
533
+ # ============================================================================
534
+ # STREAMING ENDPOINTS
535
+ # ============================================================================
536
+
537
+ @app.get("/match/{encoded_id}.m3u8")
538
+ async def get_match_stream(encoded_id: str, request: Request):
539
+ """Handle match stream requests with v3?v4?4123 flow"""
540
+ print(f"?? Match request: {encoded_id} from {request.client.host}")
541
+
542
+ # Look up the encoded ID
543
+ if encoded_id not in url_mappings:
544
+ print(f"? Match ID not found: {encoded_id}")
545
+ raise HTTPException(status_code=404, detail="Match not found")
546
+
547
+ mapping = url_mappings[encoded_id]
548
+
549
+ if mapping.get('type') != 'match':
550
+ print(f"? Invalid type for match request: {mapping.get('type')}")
551
+ raise HTTPException(status_code=400, detail="Invalid match ID")
552
+
553
+ fstv_path = mapping.get('fstv_path')
554
+ if not fstv_path:
555
+ print(f"? No FSTV path found for: {encoded_id}")
556
+ raise HTTPException(status_code=500, detail="Invalid mapping data")
557
+
558
+ try:
559
+ # Step 1: Fetch FSTV match page to extract v3 URL
560
+ match_url = FSTV_BASE_URL + fstv_path
561
+ print(f" ??? Fetching match page: {match_url}")
562
+
563
+ response = await http_client.get(match_url)
564
+ if response.status_code == 404:
565
+ print(f" ?? Match page not found (404) - may not be live yet")
566
+ raise HTTPException(status_code=503, detail="Match not available yet")
567
+
568
+ if response.status_code != 200:
569
+ print(f" ? HTTP {response.status_code} - Failed to fetch match page")
570
+ raise HTTPException(status_code=503, detail="Match page unavailable")
571
+
572
+ # Step 2: Extract v3 streaming URLs from page
573
+ streaming_urls = await extract_streaming_urls_from_html(response.text)
574
+ if not streaming_urls:
575
+ print(f" ?? No streaming URLs found in match page")
576
+ raise HTTPException(status_code=503, detail="No stream available")
577
+
578
+ v3_url = streaming_urls[0] # Use first found URL
579
+ print(f" ? Found v3 URL: {v3_url}")
580
+
581
+ # Step 3: Transform v3 → v4
582
+ v4_url = await fetch_v3_to_v4_url(v3_url)
583
+ if not v4_url:
584
+ print(f" ? Failed to get v4 URL")
585
+ raise HTTPException(status_code=503, detail="Stream transformation failed")
586
+
587
+ # Step 4: Proxy through 4123 with headers
588
+ headers = {
589
+ "Referer": "https://fstv.space",
590
+ "Origin": "https://fstv.space",
591
+ "User-Agent": USER_AGENT
592
+ }
593
+
594
+ hls_content = await proxy_stream_through_4123(v4_url, headers)
595
+ if not hls_content:
596
+ print(f" ? Failed to proxy stream through 4123")
597
+ raise HTTPException(status_code=503, detail="Stream proxy failed")
598
+
599
+ print(f"? Match stream delivered: {encoded_id}")
600
+ return PlainTextResponse(
601
+ content=hls_content,
602
+ media_type="application/vnd.apple.mpegurl",
603
+ headers={
604
+ "Cache-Control": "no-cache, no-store, must-revalidate",
605
+ "Pragma": "no-cache",
606
+ "Expires": "0"
607
+ }
608
+ )
609
+
610
+ except HTTPException:
611
+ raise
612
+ except Exception as e:
613
+ print(f"? Error processing match {encoded_id}: {e}")
614
+ raise HTTPException(status_code=500, detail="Stream processing failed")
615
+
616
+ @app.get("/channel/{encoded_id}.m3u8")
617
+ async def get_channel_stream(encoded_id: str, request: Request):
618
+ """Handle TV channel stream requests with v3?v4?4123 flow"""
619
+ print(f"?? Channel request: {encoded_id} from {request.client.host}")
620
+
621
+ # Look up the encoded ID
622
+ if encoded_id not in url_mappings:
623
+ print(f"? Channel ID not found: {encoded_id}")
624
+ raise HTTPException(status_code=404, detail="Channel not found")
625
+
626
+ mapping = url_mappings[encoded_id]
627
+
628
+ if mapping.get('type') != 'channel':
629
+ print(f"? Invalid type for channel request: {mapping.get('type')}")
630
+ raise HTTPException(status_code=400, detail="Invalid channel ID")
631
+
632
+ original_stream_url = mapping.get('original_stream_url')
633
+ if not original_stream_url:
634
+ print(f"? No stream URL found for: {encoded_id}")
635
+ raise HTTPException(status_code=500, detail="Invalid mapping data")
636
+
637
+ try:
638
+ # Step 1: We already have the v3 URL from channel mapping
639
+ v3_url = original_stream_url
640
+ print(f" ?? Channel v3 URL: {v3_url}")
641
+
642
+ # Step 2: Transform v3 → v4
643
+ v4_url = await fetch_v3_to_v4_url(v3_url)
644
+ if not v4_url:
645
+ print(f" ? Failed to get v4 URL for channel")
646
+ raise HTTPException(status_code=503, detail="Channel transformation failed")
647
+
648
+ # Step 3: Proxy through 4123 with headers
649
+ headers = {
650
+ "Referer": "https://fstv.space",
651
+ "Origin": "https://fstv.space",
652
+ "User-Agent": USER_AGENT
653
+ }
654
+
655
+ hls_content = await proxy_stream_through_4123(v4_url, headers)
656
+ if not hls_content:
657
+ print(f" ? Failed to proxy channel through 4123")
658
+ raise HTTPException(status_code=503, detail="Channel proxy failed")
659
+
660
+ print(f"? Channel stream delivered: {encoded_id}")
661
+ return PlainTextResponse(
662
+ content=hls_content,
663
+ media_type="application/vnd.apple.mpegurl",
664
+ headers={
665
+ "Cache-Control": "no-cache, no-store, must-revalidate",
666
+ "Pragma": "no-cache",
667
+ "Expires": "0"
668
+ }
669
+ )
670
+
671
+ except HTTPException:
672
+ raise
673
+ except Exception as e:
674
+ print(f"? Error processing channel {encoded_id}: {e}")
675
+ raise HTTPException(status_code=500, detail="Channel processing failed")
676
+
677
+ if __name__ == "__main__":
678
+ print("?? Starting FSTV Proxy Server on port 6680...")
679
+ print("?? A CanBert ENT / Creation")
680
+ print("?? Playlist downloads available at /playlist/ endpoints")
681
+ print("?? Auto-scraper scheduled for 12:05 AM UTC daily, add more as needed.")
682
+ print("?? All streams proxy through m3u playlist proxy:4123")
683
+
684
+ uvicorn.run(
685
+ "main:app",
686
+ host="0.0.0.0",
687
+ port=6680,
688
+ reload=True,
689
+ log_level="info"
690
+ )
src/scraper.py ADDED
@@ -0,0 +1,949 @@
1
+ #!/usr/bin/env python3
2
+ """
3
+ Combined FSTV Live Scraper
4
+ - Fetches LIVE data from FSTV endpoints
5
+ - Scrapes both sports matches and TV channels
6
+ - Finds ALL matches (live, upcoming, finished) from all sections
7
+ - Generates encoded URLs with database mapping
8
+ - Outputs M3U playlists and EPG files
9
+ - Designed for automated daily execution at 12:05 AM
10
+ """
11
+
12
+ import re
13
+ import os
14
+ import json
15
+ import random
16
+ import string
17
+ import asyncio
18
+ import base64
19
+ from datetime import datetime, timezone, timedelta
20
+ import xml.etree.ElementTree as ET
21
+ from html import unescape
22
+ import pytz
23
+ import httpx
24
+
25
+ class CombinedFSTVScraper:
26
+ """
27
+ Combined live scraper for FSTV matches and TV channels
28
+ """
29
+
30
+ def __init__(self, debug=True, proxy_server="http://fast-fstv.duckdns.org:6680"):
31
+ self.debug = debug
32
+ self.mappings_dir = "/app/mappings"
33
+ self.playlists_dir = "/app/playlists"
34
+ self.proxy_server = proxy_server.rstrip('/')
35
+
36
+ # Timezone handling
37
+ self.api_timezone = pytz.timezone('US/Eastern')
38
+ self.utc = pytz.UTC
39
+
40
+ # HTTP client for fetching live data
41
+ self.http_client = None
42
+
43
+ # URL mappings
44
+ self.match_mappings = {}
45
+ self.channel_mappings = {}
46
+
47
+ # Stats
48
+ self.stats = {
49
+ "matches_found": 0,
50
+ "channels_found": 0,
51
+ "encoded_urls_generated": 0,
52
+ "files_generated": 0,
53
+ "http_requests": 0
54
+ }
55
+
56
+ # Base URL
57
+ self.base_url = "https://fstv.space"
58
+
59
+ async def init_http_client(self):
60
+ """Initialize HTTP client for FSTV requests"""
61
+ self.http_client = httpx.AsyncClient(
62
+ headers={
63
+ "User-Agent": "Mozilla/5.0 (X11; U; Linux x86_64; pl-PL; rv:2.0) Gecko/20110307 Firefox/4.0",
64
+ "Referer": "https://fstv.space/",
65
+ "Origin": "https://fstv.space",
66
+ "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
67
+ "Accept-Language": "en-US,en;q=0.5",
68
+ "Cache-Control": "no-cache"
69
+ },
70
+ timeout=30.0,
71
+ follow_redirects=True
72
+ )
73
+
74
+ async def close_http_client(self):
75
+ """Close HTTP client"""
76
+ if self.http_client:
77
+ await self.http_client.aclose()
78
+
79
+ def generate_encoded_id(self, length=8):
80
+ """Generate random encoded ID"""
81
+ characters = string.ascii_lowercase + string.digits
82
+ return ''.join(random.choice(characters) for _ in range(length))
83
+
84
+ def ensure_unique_id(self, existing_mappings):
85
+ """Generate unique ID across all mappings"""
86
+ all_mappings = {**self.match_mappings, **self.channel_mappings, **existing_mappings}
87
+ while True:
88
+ encoded_id = self.generate_encoded_id()
89
+ if encoded_id not in all_mappings:
90
+ return encoded_id
91
+
92
+ async def fetch_live_data(self):
93
+ """Fetch LIVE data from FSTV endpoints"""
94
+ print("?? Fetching LIVE data from FSTV...")
95
+
96
+ try:
97
+ # Fetch main page (sports matches)
98
+ print(" ?? Fetching sports matches from https://fstv.space")
99
+ matches_response = await self.http_client.get("https://fstv.space")
100
+ self.stats["http_requests"] += 1
101
+
102
+ if matches_response.status_code == 200:
103
+ self.matches_html = matches_response.text
104
+ print(f" ? Sports data: {len(self.matches_html):,} characters")
105
+ else:
106
+ print(f" ? Failed to fetch sports data: HTTP {matches_response.status_code}")
107
+ self.matches_html = None
108
+
109
+ # Fetch TV channels page
110
+ print(" ?? Fetching TV channels from https://fstv.space/live-tv.html")
111
+ channels_response = await self.http_client.get("https://fstv.space/live-tv.html")
112
+ self.stats["http_requests"] += 1
113
+
114
+ if channels_response.status_code == 200:
115
+ self.channels_html = channels_response.text
116
+ print(f" ? Channels data: {len(self.channels_html):,} characters")
117
+ else:
118
+ print(f" ? Failed to fetch channels data: HTTP {channels_response.status_code}")
119
+ self.channels_html = None
120
+
121
+ return self.matches_html is not None or self.channels_html is not None
122
+
123
+ except Exception as e:
124
+ print(f"? Error fetching live data: {e}")
125
+ return False
126
+
127
+ def parse_timestamp(self, timestamp_str):
128
+ """Convert Unix timestamp to UTC datetime"""
129
+ try:
130
+ timestamp = int(timestamp_str)
131
+ return datetime.fromtimestamp(timestamp, tz=self.utc)
132
+ except (ValueError, TypeError):
133
+ return None
134
+
135
+ def extract_matches_data(self):
136
+ """Extract matches data from ALL sections of FSTV page"""
137
+ if not self.matches_html:
138
+ print("?? No matches data to process")
139
+ return []
140
+
141
+ print("?? Extracting matches data from all sections...")
142
+ matches = []
143
+
144
+ # Pattern 1: Featured slider matches (slide-item)
145
+ print(" ?? Searching slider matches...")
146
+ slide_pattern = r'<div class="slide-item">(.*?)</div>\s*(?=<div class="slide-item"|$)'
147
+ slide_blocks = re.findall(slide_pattern, self.matches_html, re.DOTALL)
148
+ print(f" Found {len(slide_blocks)} slider matches")
149
+
150
+ for block_html in slide_blocks:
151
+ match_data = self.parse_slide_block(block_html)
152
+ if match_data:
153
+ self.add_match_to_results(match_data, matches)
154
+
155
+ # Pattern 2: Common table rows (FIXED PATTERN)
156
+ print(" ?? Searching table row matches...")
157
+ table_pattern = r'<div[^>]*class="[^"]*common-table-row[^"]*table-row[^"]*"[^>]*onclick="window\.location\.href=\'([^\']+)\';?"[^>]*>(.*?)</div>'
158
+ table_matches = re.findall(table_pattern, self.matches_html, re.DOTALL)
159
+ print(f" Found {len(table_matches)} table row matches")
160
+
161
+ for match_path, block_html in table_matches:
162
+ match_data = self.parse_table_row_block(block_html, match_path)
163
+ if match_data:
164
+ self.add_match_to_results(match_data, matches)
165
+
166
+ # Pattern 3: Direct match links (comprehensive fallback)
167
+ print(" ?? Searching direct match links...")
168
+ link_pattern = r'<a[^>]*href="(/match/[^"]+)"[^>]*>.*?</a>'
169
+ link_matches = re.findall(link_pattern, self.matches_html, re.DOTALL)
170
+ unique_links = list(set(link_matches)) # Remove duplicates
171
+ print(f" Found {len(unique_links)} unique match links")
172
+
173
+ # For direct links, create basic match data
174
+ for match_path in unique_links:
175
+ # Skip if we already have this match
176
+ if any(m.get('match_path') == match_path for m in matches):
177
+ continue
178
+
179
+ # Extract info from URL
180
+ match_data = self.parse_match_path(match_path)
181
+ if match_data:
182
+ self.add_match_to_results(match_data, matches)
183
+
184
+ print(f"? Total processed {len(matches)} matches from all sections")
185
+ return matches
186
+
187
+ def add_match_to_results(self, match_data, matches):
188
+ """Add match to results with encoded ID and mapping"""
189
+ # Generate encoded ID
190
+ encoded_id = self.ensure_unique_id({})
191
+ match_data['encoded_id'] = encoded_id
192
+
193
+ # Add to mappings
194
+ self.match_mappings[encoded_id] = {
195
+ 'fstv_path': match_data['match_path'],
196
+ 'type': 'match',
197
+ 'name': self.generate_display_name(match_data),
198
+ 'league': match_data.get('league', 'Unknown'),
199
+ 'status': match_data.get('status', 'Unknown'),
200
+ 'timestamp': match_data.get('timestamp').isoformat() if match_data.get('timestamp') else None,
201
+ 'created_at': datetime.now(self.utc).isoformat()
202
+ }
203
+
204
+ matches.append(match_data)
205
+ self.stats["matches_found"] += 1
206
+ self.stats["encoded_urls_generated"] += 1
207
+
208
+ def parse_slide_block(self, block_html):
209
+ """Parse individual slide-item block"""
210
+ try:
211
+ match_data = {}
212
+
213
+ # Extract timestamp
214
+ timestamp_match = re.search(r'data-timestamp="(\d+)"', block_html)
215
+ if timestamp_match:
216
+ match_data['timestamp'] = self.parse_timestamp(timestamp_match.group(1))
217
+
218
+ # Extract league from match-name
219
+ league_match = re.search(r'<span class="match-name">([^<]+)</span>', block_html)
220
+ if league_match:
221
+ match_data['league'] = league_match.group(1).strip()
222
+
223
+ # Extract match URL from btn-club link
224
+ url_match = re.search(r'<a class="btn-club[^"]*" href=([^>]+)>', block_html)
225
+ if url_match:
226
+ match_path = url_match.group(1).strip()
227
+ match_path = match_path.strip('\'"')
228
+ match_data['match_path'] = match_path
229
+
230
+ # Extract teams and scores
231
+ teams_data = self.extract_slide_teams(block_html)
232
+ if teams_data:
233
+ match_data.update(teams_data)
234
+
235
+ # Determine status based on scores and timestamp
236
+ self.determine_match_status(match_data)
237
+
238
+ if not match_data.get('league') or not match_data.get('teams'):
239
+ return None
240
+
241
+ return match_data
242
+
243
+ except Exception as e:
244
+ if self.debug:
245
+ print(f"?? Error parsing slide block: {e}")
246
+ return None
247
+
248
+ def parse_table_row_block(self, block_html, match_path):
249
+ """Parse table row block"""
250
+ try:
251
+ match_data = {'match_path': match_path}
252
+
253
+ # Extract timestamp if present
254
+ timestamp_match = re.search(r'data-timestamp="(\d+)"', block_html)
255
+ if timestamp_match:
256
+ match_data['timestamp'] = self.parse_timestamp(timestamp_match.group(1))
257
+
258
+ # Extract league info (table format)
259
+ league_match = re.search(r'<a[^>]*class="league-name"[^>]*alt="([^"]*)"[^>]*>([^<]+)</a>', block_html)
260
+ if league_match:
261
+ match_data['league'] = league_match.group(1) or league_match.group(2)
262
+
263
+ # Extract status from title attribute (table format)
264
+ status_match = re.search(r'<span[^>]*title="([^"]*)"[^>]*class="text-overflow">([^<]*)</span>', block_html)
265
+ if status_match:
266
+ title_status = status_match.group(1).strip()
267
+ if "Not Started" in title_status:
268
+ match_data['status'] = 'Upcoming'
269
+ elif "Live" in title_status:
270
+ match_data['status'] = 'Live'
271
+ elif "Finished" in title_status or "Final" in title_status:
272
+ match_data['status'] = 'FT'
273
+ else:
274
+ match_data['status'] = 'Unknown'
275
+
276
+ # Extract teams from table row structure
277
+ teams_data = self.extract_table_teams(block_html)
278
+ if teams_data:
279
+ match_data.update(teams_data)
280
+ else:
281
+ # Fallback to URL parsing
282
+ url_data = self.parse_match_path(match_path)
283
+ if url_data:
284
+ match_data.update(url_data)
285
+
286
+ return match_data if match_data.get('teams') else None
287
+
288
+ except Exception as e:
289
+ if self.debug:
290
+ print(f"?? Error parsing table row: {e}")
291
+ return None
292
+
293
+ def parse_match_path(self, match_path):
294
+ """Extract info from match URL path"""
295
+ try:
296
+ path_parts = match_path.strip('/').split('/')
297
+ if len(path_parts) < 2:
298
+ return None
299
+
300
+ match_part = path_parts[1]
301
+ parts = match_part.rsplit('-', 1)
302
+ if len(parts) < 2:
303
+ return None
304
+
305
+ name_sport_part = parts[0]
306
+
307
+ # Find sport - safe pattern
308
+ pattern = r'-([a-zA-Z]+)$'
309
+ sport_match = re.search(pattern, name_sport_part)
310
+ sport = sport_match.group(1) if sport_match else 'Sports'
311
+
312
+ # Remove sport to get team names
313
+ remove_pattern = r'-[a-zA-Z]+$'
314
+ team_part = re.sub(remove_pattern, '', name_sport_part)
315
+
316
+ if '-vs-' in team_part:
317
+ team_names = team_part.split('-vs-')
318
+ if len(team_names) == 2:
319
+ home_team = team_names[0].replace('-', ' ').title()
320
+ away_team = team_names[1].replace('-', ' ').title()
321
+
322
+ return {
323
+ 'match_path': match_path,
324
+ 'league': sport.title(),
325
+ 'teams': {
326
+ 'home': {'name': home_team},
327
+ 'away': {'name': away_team}
328
+ },
329
+ 'match_type': 'vs',
330
+ 'status': 'Upcoming'
331
+ }
332
+
333
+ return None
334
+
335
+ except Exception as e:
336
+ if self.debug:
337
+ print(f"Error parsing match path: {e}")
338
+ return None
339
+
340
+ def extract_slide_teams(self, block_html):
341
+ """Extract teams and scores from slide block"""
342
+ teams_data = {}
343
+
344
+ # Pattern to find club containers
345
+ club_pattern = r'<div class="club">(.*?)</div>'
346
+ clubs = re.findall(club_pattern, block_html, re.DOTALL)
347
+
348
+ if len(clubs) >= 2:
349
+ home_team = self.extract_slide_team_info(clubs[0])
350
+ away_team = self.extract_slide_team_info(clubs[1])
351
+
352
+ if home_team and away_team:
353
+ teams_data['teams'] = {'home': home_team, 'away': away_team}
354
+ teams_data['match_type'] = 'vs'
355
+ return teams_data
356
+
357
+ return teams_data
358
+
359
+ def extract_table_teams(self, block_html):
360
+ """Extract teams from table row structure"""
361
+ teams_data = {}
362
+
363
+ # Look for club-item pattern (different from slider)
364
+ club_pattern = r'<div class="club-item[^"]*">(.*?)</div>'
365
+ clubs = re.findall(club_pattern, block_html, re.DOTALL)
366
+
367
+ if len(clubs) >= 2:
368
+ home_team = self.extract_table_team_info(clubs[0])
369
+ away_team = self.extract_table_team_info(clubs[1])
370
+
371
+ if home_team and away_team:
372
+ teams_data['teams'] = {'home': home_team, 'away': away_team}
373
+ teams_data['match_type'] = 'vs'
374
+ return teams_data
375
+
376
+ return teams_data
377
+
378
+ def extract_slide_team_info(self, club_html):
379
+ """Extract team info from slide club block"""
380
+ team_info = {}
381
+
382
+ # Team name from club-name span
383
+ name_match = re.search(r'<div class="club-name text-overflow">\s*([^<]+)\s*</div>', club_html)
384
+ if name_match:
385
+ team_info['name'] = name_match.group(1).strip()
386
+
387
+ # Score from score span
388
+ score_match = re.search(r'<span class="score">(\d+)</span>', club_html)
389
+ if score_match:
390
+ team_info['score'] = score_match.group(1)
391
+
392
+ # Fallback: extract team name from img alt attribute
393
+ if not team_info.get('name'):
394
+ alt_match = re.search(r'<img[^>]*alt="([^"]+)"[^>]*>', club_html)
395
+ if alt_match:
396
+ full_name = alt_match.group(1).strip()
397
+ team_info['name'] = full_name[:3].upper()
398
+
399
+ return team_info if team_info.get('name') else None
400
+
401
+ def extract_table_team_info(self, club_html):
402
+ """Extract team info from table club block"""
403
+ team_info = {}
404
+
405
+ # Team name from club-name div
406
+ name_match = re.search(r'<div class="club-name[^"]*"[^>]*>\s*([^<]+)\s*</div>', club_html)
407
+ if name_match:
408
+ team_info['name'] = name_match.group(1).strip()
409
+
410
+ # Score from b-text-dark span
411
+ score_match = re.search(r'<span class="b-text-dark">(\d+)</span>', club_html)
412
+ if score_match:
413
+ team_info['score'] = score_match.group(1)
414
+
415
+ return team_info if team_info.get('name') else None
416
+
417
+ def determine_match_status(self, match_data):
418
+ """Determine match status based on available data"""
419
+ if match_data.get('teams'):
420
+ home_score = match_data['teams']['home'].get('score')
421
+ away_score = match_data['teams']['away'].get('score')
422
+
423
+ if home_score and away_score:
424
+ if int(home_score) > 0 or int(away_score) > 0:
425
+ match_data['status'] = 'Live' # Has non-zero scores
426
+ else:
427
+ match_data['status'] = 'Upcoming' # 0-0 likely upcoming
428
+ else:
429
+ match_data['status'] = 'Upcoming' # No scores = upcoming
430
+ else:
431
+ match_data['status'] = 'Upcoming' # Default
432
+
433
+ def generate_display_name(self, match_data):
434
+ """Generate display name"""
435
+ if match_data.get('match_type') == 'vs' and match_data.get('teams'):
436
+ home = match_data['teams']['home']['name']
437
+ away = match_data['teams']['away']['name']
438
+ return f"{home} vs {away}"
439
+ elif match_data.get('event_name'):
440
+ return match_data['event_name']
441
+ return "Unknown Event"
442
+
443
+ def extract_channels_data(self):
444
+ """Extract TV channels data and generate mappings"""
445
+ if not self.channels_html:
446
+ print("?? No channels data to process")
447
+ return []
448
+
449
+ print("?? Extracting TV channels data...")
450
+ channels = []
451
+
452
+ # Pattern for channels
453
+ channel_pattern = r'<div class="item-channel"[^>]*?data-id\s*=\s*"([^"]*)"[^>]*?data-link="([^"]*)"[^>]*?data-logo="([^"]*)"[^>]*?title="([^"]*)"[^>]*?>'
454
+ channel_matches = re.findall(channel_pattern, self.channels_html, re.DOTALL)
455
+
456
+ print(f"?? Found {len(channel_matches)} TV channels")
457
+
458
+ for data_id, data_link, data_logo, title in channel_matches:
459
+ channel_data = self.parse_channel_data(data_id, data_link, data_logo, title)
460
+ if channel_data:
461
+ # Generate STABLE encoded ID using base64 of data_id
462
+ encoded_id = base64.urlsafe_b64encode(data_id.encode()).decode().rstrip('=')
463
+ channel_data['encoded_id'] = encoded_id
464
+
465
+ # Add to mappings
466
+ self.channel_mappings[encoded_id] = {
467
+ 'fstv_data_id': data_id,
468
+ 'original_stream_url': data_link,
469
+ 'type': 'channel',
470
+ 'name': channel_data['name'],
471
+ 'clean_name': channel_data['clean_name'],
472
+ 'logo': data_logo,
473
+ 'category': channel_data['category'],
474
+ 'country': channel_data['country'],
475
+ 'created_at': datetime.now(timezone.utc).isoformat()
476
+ }
477
+
478
+ channels.append(channel_data)
479
+ self.stats["channels_found"] += 1
480
+ self.stats["encoded_urls_generated"] += 1
481
+
482
+ print(f"? Processed {len(channels)} channels with stable IDs")
483
+ return channels
484
+
485
+ def parse_channel_data(self, data_id, data_link, data_logo, title):
486
+ """Parse channel data"""
487
+ try:
488
+ channel_name = unescape(title.strip())
489
+
490
+ if not channel_name or not data_link:
491
+ return None
492
+
493
+ channel_info = self.categorize_channel(channel_name)
494
+
495
+ return {
496
+ 'id': data_id,
497
+ 'name': channel_name,
498
+ 'logo': data_logo,
499
+ 'original_stream_url': data_link,
500
+ 'category': channel_info['category'],
501
+ 'country': channel_info['country'],
502
+ 'clean_name': self.clean_channel_name(channel_name)
503
+ }
504
+
505
+ except Exception as e:
506
+ if self.debug:
507
+ print(f"?? Error parsing channel: {e}")
508
+ return None
509
+
510
+ def categorize_channel(self, channel_name):
511
+ """Categorize channel"""
512
+ name_lower = channel_name.lower()
513
+
514
+ # Country
515
+ if any(x in name_lower for x in ['usa', 'cbs', 'nbc', 'abc', 'fox', 'espn']):
516
+ country = "USA"
517
+ elif any(x in name_lower for x in ['uk', 'itv', 'bbc', 'sky']):
518
+ country = "UK"
519
+ else:
520
+ country = "International"
521
+
522
+ # Category
523
+ if any(x in name_lower for x in ['sport', 'espn', 'fox sports', 'nfl', 'nba']):
524
+ category = "Sports"
525
+ elif any(x in name_lower for x in ['news', 'cnn', 'fox news', 'msnbc']):
526
+ category = "News"
527
+ else:
528
+ category = "Entertainment"
529
+
530
+ return {'category': category, 'country': country}
531
+
532
+ def clean_channel_name(self, name):
533
+ """Clean channel name"""
534
+ clean = re.sub(r'[^\w\s\-]', '', name)
535
+ clean = re.sub(r'\s+', '-', clean.strip())
536
+ clean = clean.lower()
537
+ clean = re.sub(r'^(ve-|cdn-|3uk-)', '', clean)
538
+ clean = re.sub(r'(-sv\d+|-\(sv\d+\))$', '', clean)
539
+ return clean
540
+
541
+ def generate_files(self, matches, channels):
542
+ """Generate all M3U and XML files"""
543
+ print("?? Generating playlist and EPG files...")
544
+
545
+ # Ensure directories exist
546
+ os.makedirs(self.mappings_dir, exist_ok=True)
547
+ os.makedirs(self.playlists_dir, exist_ok=True)
548
+
549
+ # Save mappings
550
+ self.save_mappings()
551
+
552
+ # Generate matches M3U and EPG
553
+ if matches:
554
+ self.generate_matches_files(matches)
555
+
556
+ # Generate channels M3U
557
+ if channels:
558
+ self.generate_channels_files(channels)
559
+
560
+ print(f"? Generated {self.stats['files_generated']} files")
561
+
562
+ def save_mappings(self):
563
+ """Save URL mappings to JSON files"""
564
+ # Save matches mappings
565
+ if self.match_mappings:
566
+ matches_file = os.path.join(self.mappings_dir, "url_mappings_matches.json")
567
+ with open(matches_file, 'w', encoding='utf-8') as f:
568
+ json.dump(self.match_mappings, f, indent=2, ensure_ascii=False)
569
+ print(f"? Saved {len(self.match_mappings)} match mappings")
570
+ self.stats["files_generated"] += 1
571
+
572
+ # Save channels mappings
573
+ if self.channel_mappings:
574
+ channels_file = os.path.join(self.mappings_dir, "url_mappings_channels.json")
575
+ with open(channels_file, 'w', encoding='utf-8') as f:
576
+ json.dump(self.channel_mappings, f, indent=2, ensure_ascii=False)
577
+ print(f"? Saved {len(self.channel_mappings)} channel mappings")
578
+ self.stats["files_generated"] += 1
579
+
580
+ def generate_matches_files(self, matches):
581
+ """Generate matches M3U and EPG files"""
582
+ # Generate M3U
583
+ m3u_content = self.generate_matches_m3u(matches)
584
+ m3u_file = os.path.join(self.playlists_dir, "fstv_matches_encoded.m3u")
585
+ with open(m3u_file, 'w', encoding='utf-8') as f:
586
+ f.write(m3u_content)
587
+ print(f"? Saved matches M3U: {len(matches)} channels")
588
+ self.stats["files_generated"] += 1
589
+
590
+ # Generate EPG
591
+ epg_xml = self.generate_matches_epg(matches)
592
+ epg_file = os.path.join(self.playlists_dir, "fstv_matches_encoded.xml")
593
+ tree = ET.ElementTree(epg_xml)
594
+ ET.indent(tree, space=" ", level=0)
595
+ tree.write(epg_file, encoding="utf-8", xml_declaration=True)
596
+ print(f"? Saved matches EPG")
597
+ self.stats["files_generated"] += 1
598
+
599
+ def generate_matches_m3u(self, matches):
600
+ """Generate matches M3U content"""
601
+ m3u_content = f'#EXTM3U url-tvg="{self.proxy_server}/epg/matches.xml"\n'
602
+ channel_number = 3000
603
+
604
+ for match in matches:
605
+ # Generate channel info
606
+ if match.get('match_type') == 'vs' and match.get('teams'):
607
+ home_team = match['teams']['home']['name']
608
+ away_team = match['teams']['away']['name']
609
+ home_score = match['teams']['home'].get('score')
610
+ away_score = match['teams']['away'].get('score')
611
+
612
+ if match.get('status') == 'Live' and home_score and away_score:
613
+ channel_name = f"?? LIVE: {home_team} {home_score} - {away_score} {away_team}"
614
+ elif match.get('status') == 'Live':
615
+ channel_name = f"?? LIVE: {home_team} vs {away_team}"
616
+ elif match.get('status') == 'FT' and home_score and away_score:
617
+ channel_name = f"Final: {home_team} {home_score} - {away_score} {away_team}"
618
+ else:
619
+ channel_name = f"{home_team} vs {away_team}"
620
+ else:
621
+ event_name = match.get('event_name', 'Unknown Event')
622
+ if match.get('status') == 'Live':
623
+ channel_name = f"?? LIVE: {event_name}"
624
+ else:
625
+ channel_name = event_name
626
+
627
+ # Clean and encode
628
+ channel_name_clean = re.sub(r'[^\w\s\-\(\):]', '', channel_name).strip()
629
+ encoded_url = f"{self.proxy_server}/match/{match['encoded_id']}.m3u8"
630
+ tvg_id = f"fstv_{match['encoded_id']}"
631
+ league = match.get('league', 'FSTV Sports')
632
+ group_title = f"FSTV - {league}"
633
+
634
+ m3u_content += f'#EXTINF:-1 tvg-chno="{channel_number}" tvg-id="{tvg_id}" '
635
+ m3u_content += f'tvg-name="{channel_name_clean}" tvg-logo="https://www.pngall.com/wp-content/uploads/1/Sports-PNG-Image.png" '
636
+ m3u_content += f'group-title="{group_title}",{channel_name_clean}\n'
637
+ m3u_content += f"{encoded_url}\n"
638
+
639
+ channel_number += 1
640
+
641
+ return m3u_content
642
+
643
+ def format_time_for_canadian_zones(self, dt):
644
+ """Format datetime for Canadian time zones"""
645
+ if not dt:
646
+ return "TBD"
647
+
648
+ # Canadian time zones
649
+ pst = pytz.timezone('US/Pacific')
650
+ cst = pytz.timezone('US/Central')
651
+ est = pytz.timezone('US/Eastern')
652
+
653
+ pst_time = dt.astimezone(pst).strftime("%I:%M %p")
654
+ cst_time = dt.astimezone(cst).strftime("%I:%M %p")
655
+ est_time = dt.astimezone(est).strftime("%I:%M %p")
656
+
657
+ return f"{est_time} EST / {cst_time} CST / {pst_time} PST"
658
+
659
+ def generate_matches_epg(self, matches):
660
+ """Generate matches EPG/XMLTV - Timmys Format"""
661
+ tv = ET.Element("tv")
662
+ tv.set("generator-info-name", "FSTV Combined Scraper")
663
+ tv.set("generator-info-url", "https://fstv.space")
664
+
665
+ # Timmys format: 36 hours starting 2 hours before scrape time
666
+ current_time = datetime.now(self.utc)
667
+ epg_start_time = current_time - timedelta(hours=2)
668
+ epg_end_time = epg_start_time + timedelta(hours=36)
669
+
670
+ for match in matches:
671
+ # Channel definition
672
+ if match.get('match_type') == 'vs' and match.get('teams'):
673
+ home_team = match['teams']['home']['name']
674
+ away_team = match['teams']['away']['name']
675
+ channel_display = f"{home_team} vs {away_team}"
676
+ else:
677
+ channel_display = match.get('event_name', 'Unknown Event')
678
+
679
+ tvg_id = f"fstv_{match['encoded_id']}"
680
+
681
+ channel = ET.SubElement(tv, "channel")
682
+ channel.set("id", tvg_id)
683
+ display_name = ET.SubElement(channel, "display-name")
684
+ display_name.text = re.sub(r'[^\w\s\-\(\):]', '', channel_display).strip()
685
+
686
+ # Programme blocks - Timmys format
687
+ self.generate_match_programmes(tv, tvg_id, match, epg_start_time, epg_end_time)
688
+
689
+ return tv
690
+
691
+ def generate_match_programmes(self, tv_element, channel_id, match, epg_start_time, epg_end_time):
692
+ """Generate EPG programme blocks for a match - Timmys Format"""
693
+ match_time = match.get('timestamp')
694
+ current_time = datetime.now(self.utc)
695
+
696
+ # Generate team names for title
697
+ if match.get('match_type') == 'vs' and match.get('teams'):
698
+ home_team = match['teams']['home']['name']
699
+ away_team = match['teams']['away']['name']
700
+ event_title = f"{home_team} vs {away_team}"
701
+ else:
702
+ event_title = match.get('event_name', 'Unknown Event')
703
+
704
+ # Simple time-based logic (ignore status, use only timestamp)
705
+ if match_time:
706
+ live_end_time = match_time + timedelta(hours=3)
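+ # The channel's EPG is built in three phases below: UPCOMING (2-hour filler blocks
+ # up to kickoff), LIVE (one block covering kickoff to kickoff + 3 hours), and
+ # ENDED (a single block from then until the end of the EPG window).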
707
+
708
+ # UPCOMING EVENT blocks (2-hour blocks before start time)
709
+ if current_time < match_time:
710
+ block_start = max(epg_start_time, current_time)
711
+ while block_start < match_time and block_start < epg_end_time:
712
+ block_end = min(block_start + timedelta(hours=2), match_time, epg_end_time)
713
+
714
+ programme = ET.SubElement(tv_element, "programme")
715
+ programme.set("start", block_start.strftime("%Y%m%d%H%M%S +0000"))
716
+ programme.set("stop", block_end.strftime("%Y%m%d%H%M%S +0000"))
717
+ programme.set("channel", channel_id)
718
+
719
+ title = ET.SubElement(programme, "title")
720
+ title.text = f"UPCOMING EVENT: {event_title}"
721
+
722
+ desc = ET.SubElement(programme, "desc")
723
+ time_str = self.format_time_for_canadian_zones(match_time)
724
+ desc.text = f"Event starts at {time_str}. Tune in to watch {event_title} live!"
725
+
726
+ category_elem = ET.SubElement(programme, "category")
727
+ category_elem.text = "Sports"
728
+
729
+ block_start = block_end
730
+
731
+ # EVENT NOW LIVE block (3-hour block during event)
732
+ if current_time >= match_time and current_time < live_end_time:
733
+ live_start = max(match_time, epg_start_time)
734
+ live_stop = min(live_end_time, epg_end_time)
735
+
736
+ if live_start < epg_end_time:
737
+ programme = ET.SubElement(tv_element, "programme")
738
+ programme.set("start", live_start.strftime("%Y%m%d%H%M%S +0000"))
739
+ programme.set("stop", live_stop.strftime("%Y%m%d%H%M%S +0000"))
740
+ programme.set("channel", channel_id)
741
+
742
+ title = ET.SubElement(programme, "title")
743
+ title.text = f"🔴 EVENT NOW LIVE FOR: {event_title}"
744
+
745
+ desc = ET.SubElement(programme, "desc")
746
+ desc.text = f"Live coverage of {event_title}. Watch all the action as it happens!"
747
+
748
+ category_elem = ET.SubElement(programme, "category")
749
+ category_elem.text = "Sports"
750
+
751
+ # EVENT HAS ENDED block (after live+3 hours)
752
+ if current_time >= live_end_time:
753
+ ended_start = max(live_end_time, epg_start_time)
754
+ ended_stop = epg_end_time
755
+
756
+ if ended_start < epg_end_time:
757
+ programme = ET.SubElement(tv_element, "programme")
758
+ programme.set("start", ended_start.strftime("%Y%m%d%H%M%S +0000"))
759
+ programme.set("stop", ended_stop.strftime("%Y%m%d%H%M%S +0000"))
760
+ programme.set("channel", channel_id)
761
+
762
+ title = ET.SubElement(programme, "title")
763
+ title.text = f"EVENT HAS ENDED: {event_title}"
764
+
765
+ desc = ET.SubElement(programme, "desc")
766
+ desc.text = f"Event has ended: {event_title}"
767
+
768
+ category_elem = ET.SubElement(programme, "category")
769
+ category_elem.text = "Sports"
770
+
771
+ else:
772
+ # No timestamp available - fill with default programming
773
+ programme = ET.SubElement(tv_element, "programme")
774
+ programme.set("start", epg_start_time.strftime("%Y%m%d%H%M%S +0000"))
775
+ programme.set("stop", epg_end_time.strftime("%Y%m%d%H%M%S +0000"))
776
+ programme.set("channel", channel_id)
777
+
778
+ title = ET.SubElement(programme, "title")
779
+ title.text = event_title
780
+
781
+ desc = ET.SubElement(programme, "desc")
782
+ desc.text = f"Sports programming: {event_title}"
783
+
784
+ category_elem = ET.SubElement(programme, "category")
785
+ category_elem.text = "Sports"
786
+
787
+ def generate_match_title(self, match_data):
788
+ """Generate match title - Timmys Format"""
789
+ match_time = match_data.get('timestamp')
790
+ current_time = datetime.now(self.utc)
791
+
792
+ if match_data.get('match_type') == 'vs' and match_data.get('teams'):
793
+ home_team = match_data['teams']['home']['name']
794
+ away_team = match_data['teams']['away']['name']
795
+ event_title = f"{home_team} vs {away_team}"
796
+ else:
797
+ event_title = match_data.get('event_name', 'Unknown Event')
798
+
799
+ if match_time:
800
+ live_end_time = match_time + timedelta(hours=3)
801
+
802
+ if current_time < match_time:
803
+ return f"UPCOMING EVENT: {event_title}"
804
+ elif current_time >= match_time and current_time < live_end_time:
805
+ return f"🔴 EVENT NOW LIVE FOR: {event_title}"
806
+ else:
807
+ return f"EVENT HAS ENDED: {event_title}"
808
+
809
+ return event_title
810
+
811
+ def generate_match_description(self, match_data):
812
+ """Generate match description - Timmys Format"""
813
+ match_time = match_data.get('timestamp')
814
+ current_time = datetime.now(self.utc)
815
+ league = match_data.get('league', 'Unknown League')
816
+
817
+ if match_data.get('match_type') == 'vs' and match_data.get('teams'):
818
+ home_team = match_data['teams']['home']['name']
819
+ away_team = match_data['teams']['away']['name']
820
+ event_title = f"{home_team} vs {away_team}"
821
+ else:
822
+ event_title = match_data.get('event_name', 'Unknown Event')
823
+
824
+ if match_time:
825
+ live_end_time = match_time + timedelta(hours=3)
826
+ time_str = self.format_time_for_canadian_zones(match_time)
827
+
828
+ if current_time < match_time:
829
+ return f"Event starts at {time_str}. Tune in to watch {event_title} live in {league}!"
830
+ elif current_time >= match_time and current_time < live_end_time:
831
+ return f"Live coverage of {event_title} in {league}. Watch all the action as it happens!"
832
+ else:
833
+ return f"Event has ended: {event_title} in {league}."
834
+
835
+ return f"Sports programming: {event_title} in {league}."
836
+
837
+ def generate_channels_files(self, channels):
838
+ """Generate channels M3U file"""
839
+ # Generate M3U
840
+ m3u_content = self.generate_channels_m3u(channels)
841
+ m3u_file = os.path.join(self.playlists_dir, "fstv_tv_channels_encoded.m3u")
842
+ with open(m3u_file, 'w', encoding='utf-8') as f:
843
+ f.write(m3u_content)
844
+ print(f"✅ Saved channels M3U: {len(channels)} channels")
845
+ self.stats["files_generated"] += 1
846
+
847
+ def generate_channels_m3u(self, channels):
848
+ """Generate channels M3U content"""
849
+ # Sort channels
850
+ sorted_channels = sorted(channels, key=lambda x: (x['country'], x['category'], x['name']))
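+ # Sorting by country, then category, then name keeps each group-title contiguous in the playlist.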
851
+
852
+ m3u_content = '#EXTM3U url-tvg=""\n'
853
+ channel_number = 1000
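+ # TV channels are numbered from 1000, presumably to keep them clear of the sports-match numbering.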
854
+
855
+ for channel in sorted_channels:
856
+ display_name = self.clean_display_name(channel['name'])
857
+ encoded_url = f"{self.proxy_server}/channel/{channel['encoded_id']}.m3u8"
858
+ tvg_id = f"fstv_tv_{channel['encoded_id']}"
859
+ group_title = f"FSTV TV - {channel['country']} {channel['category']}"
860
+
861
+ m3u_content += f'#EXTINF:-1 tvg-chno="{channel_number}" tvg-id="{tvg_id}" '
862
+ m3u_content += f'tvg-name="{display_name}" tvg-logo="{channel["logo"]}" '
863
+ m3u_content += f'group-title="{group_title}",{display_name}\n'
864
+ m3u_content += f"{encoded_url}\n"
865
+
866
+ channel_number += 1
867
+
868
+ return m3u_content
869
+
870
+ def clean_display_name(self, name):
871
+ """Clean channel name for display"""
872
+ display = re.sub(r'^(VE-|CDN-|3uk-)', '', name, flags=re.IGNORECASE)
873
+ display = re.sub(r'\s*\(sv\d+\)\s*$', '', display, flags=re.IGNORECASE)
874
+ display = re.sub(r'\s*-\s*sv\d+\s*$', '', display, flags=re.IGNORECASE)
875
+ display = display.strip()
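+ # Illustrative effect of the rules above (hypothetical input): "VE-ESPN (sv2)" -> "ESPN"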
876
+
877
+ # Special cases
878
+ replacements = {
879
+ 'usanetwork': 'USA Network',
880
+ 'cbssport': 'CBS Sports Network',
881
+ 'cbs los angeles': 'CBS Los Angeles',
882
+ 'itv1': 'ITV1', 'itv2': 'ITV2', 'itv3': 'ITV3', 'itv4': 'ITV4',
883
+ 'lfctv': 'Liverpool FC TV'
884
+ }
885
+
886
+ display_lower = display.lower()
887
+ for key, value in replacements.items():
888
+ if key in display_lower:
889
+ display = value
890
+ break
891
+
892
+ return display
893
+
894
+ def print_stats(self):
895
+ """Print final statistics"""
896
+ print("\n📊 SCRAPING COMPLETE:")
897
+ print(f" ⚽ Matches found: {self.stats['matches_found']}")
898
+ print(f" 📺 Channels found: {self.stats['channels_found']}")
899
+ print(f" 🔗 Encoded URLs generated: {self.stats['encoded_urls_generated']}")
900
+ print(f" 📄 Files generated: {self.stats['files_generated']}")
901
+ print(f" 🌐 HTTP requests made: {self.stats['http_requests']}")
902
+
903
+ async def run(self):
904
+ """Main scraper execution"""
905
+ print("🚀 FSTV Combined Scraper Starting")
906
+ print("⏰ Scheduled for 12:05 AM UTC daily execution")
907
+ print("=" * 50)
908
+
909
+ try:
910
+ # Initialize HTTP client
911
+ await self.init_http_client()
912
+
913
+ # Fetch LIVE data from FSTV
914
+ if not await self.fetch_live_data():
915
+ print("❌ Failed to fetch live data from FSTV. Exiting.")
916
+ return False
917
+
918
+ # Extract data
919
+ matches = self.extract_matches_data()
920
+ channels = self.extract_channels_data()
921
+
922
+ if not matches and not channels:
923
+ print("⚠️ No data extracted. Exiting.")
924
+ return False
925
+
926
+ # Generate files
927
+ self.generate_files(matches, channels)
928
+
929
+ # Print stats
930
+ self.print_stats()
931
+
932
+ print("\n🎉 Scraping completed successfully!")
933
+ return True
934
+
935
+ except Exception as e:
936
+ print(f"❌ Scraper error: {e}")
937
+ return False
938
+ finally:
939
+ # Clean up HTTP client
940
+ await self.close_http_client()
941
+
942
+ async def main():
943
+ """Entry point"""
944
+ scraper = CombinedFSTVScraper(debug=True)
945
+ success = await scraper.run()
946
+ raise SystemExit(0 if success else 1)
947
+
948
+ if __name__ == "__main__":
949
+ asyncio.run(main())