path: root/youtube_dl/extractor
Commit message | Author | Age | Files | Lines
* [YouPorn] Add playlist extractors | dirkf | 2024-04-22 | 2 | -1/+447
  * YouPornCategoryIE
  * YouPornChannelIE
  * YouPornCollectionIE
  * YouPornStarIE
  * YouPornTagIE
  * YouPornVideosIE
* [YouPorn] Improve extraction | dirkf | 2024-04-22 | 1 | -18/+46
  * detect unwatchable videos
  * improve duration extraction
  * fix count extraction and support large values
  * detect and remove SEO spam boilerplate description
* [YouPorn] Incorporate yt-dlp PR 8827 | dirkf | 2024-04-22 | 1 | -38/+80
  * from https://github.com/yt-dlp/yt-dlp/pull/8827
  * extract from webpage instead of broken API URL
  * thx The-MAGI
* [Youtube] Fix unwanted private method __ie_msg in f8b0135850 | gy-chen | 2024-03-23 | 1 | -3/+3
  Fixes `AttributeError no attribute '_YoutubeIE__ie_msg'` if unable to decode n-parameter
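The attribute name in that error is the signature of Python's private name mangling: inside a class body, `self.__name` is compiled as `self._ClassName__name`. A minimal sketch of the failure mode follows; the class names are hypothetical stand-ins, and the single-underscore rename is one common remedy, not necessarily the exact change made in this commit.

```python
class BaseSketchIE(object):
    # Hypothetical stand-in for a base extractor class.
    def __ie_msg(self, msg):
        # Two leading underscores: stored as _BaseSketchIE__ie_msg.
        return '[base] ' + msg

    def _ie_msg(self, msg):
        # One leading underscore: no mangling, visible to subclasses.
        return '[base] ' + msg


class ChildSketchIE(BaseSketchIE):
    def broken(self):
        # Compiled as self._ChildSketchIE__ie_msg, which does not exist.
        return self.__ie_msg('n-parameter decode failed')

    def fixed(self):
        return self._ie_msg('n-parameter decode failed')


ie = ChildSketchIE()
try:
    ie.broken()
    error_text = None
except AttributeError as exc:
    error_text = str(exc)

fixed_text = ie.fixed()
```

Here `error_text` names the mangled attribute `_ChildSketchIE__ie_msg`, mirroring the `_YoutubeIE__ie_msg` message quoted in the commit.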
* [Vimeo] Improve `config` extraction (#32742) | Zizheng Guo | 2024-03-12 | 1 | -2/+2
  * update for more robust json parsing
* [Videa] Fix extraction | hatsomatt | 2024-03-08 | 1 | -1/+4
  * update API URL
  * from https://github.com/yt-dlp/yt-dlp/pull/8003
  * thanks to the authors!
  Closes yt-dlp/7427
  Authored by: hatsomatt, aky-01
* [Videa] Align with yt-dlp | dirkf | 2024-03-08 | 1 | -13/+26
* [XFileShare] Update extractor for 2024 | dirkf | 2024-03-08 | 3 | -147/+193
  * simplify aa_decode()
  * review and update supported sites and tests
  * in above, include FileMoon.sx, and remove separate module
  * incorporate changes from yt-dlp
  * allow for decoding multiple scripts (eg, FileMoon)
  * use new JWPlayer extraction
* [InfoExtractor] Rework and improve JWPlayer extraction | dirkf | 2024-03-08 | 1 | -33/+22
  * use traverse_obj() and _search_json()
  * support playlist `.load({**video1},{**video2}, ...)`
  * support transform_source=... for _extract_jwplayer_data()
* [InfoExtractor] Add `_search_json()` | dirkf | 2024-03-08 | 1 | -0/+55
  * uses the error diagnostic to truncate the JSON string
  * may be confused by non-C-Pythons
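The idea of using the decoder's error diagnostic to truncate the string can be sketched as below. This is an illustration only, not the project's `_search_json()`: the function name `parse_leading_json` is hypothetical, and it assumes Python 3.5+, where `json.JSONDecodeError` exposes `msg` and `pos`.

```python
import json


def parse_leading_json(text):
    """Parse the JSON value at the start of `text`, ignoring trailing junk.

    Hypothetical helper: when the decoder reports "Extra data", a complete
    value ended at the reported position, so truncate there and retry.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        if exc.msg != 'Extra data':
            raise
        return json.loads(text[:exc.pos])


# A JSON object embedded in JavaScript, followed by unrelated code.
data = parse_leading_json('{"a": 1}; var x = 2;')
```

A script fragment such as `{"a": 1}; var x = 2;` yields the leading object, while genuinely malformed JSON still raises.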
* [caffeine.tv] Add new extractor (#32514) | Aaron Tan | 2024-02-22 | 2 | -0/+80
  * Add CaffeineTVIE info extractor to support site caffeine.tv
  ---------
  Co-authored-by: dirkf <fieldhouse@gmx.net>
* [GBNews] Add new extractor for GB News TV channel (#29432) | dirkf | 2024-02-22 | 2 | -0/+140
  * Add extractor for GB News TV channel
  * Support more GBNews URL formats
    Allow alphanumeric and _ in place of `shows`, which redirect to site's preferred URL
  * Update for 2024
* [Vbox7] Improve extraction, adding features from yt-dlp PR #9100 | dirkf | 2024-02-19 | 1 | -27/+53
  * changes from https://github.com/yt-dlp/yt-dlp/pull/9100 (thx seproDev):
    - attempt HLS extraction
    - re-enable XFF
    - test `view_count`, `duration` extraction
  * improve commenting, error checks
* [Vbox7IE] Sanitise ld+json containing unexpected characters | dirkf | 2024-02-02 | 1 | -0/+22
  * based on PR #29680
  * added hack to force invoking `transform_source`
  * fixes #26218
* [Vbox7IE] Improve extraction | dirkf | 2024-02-02 | 1 | -39/+90
  * DASH extraction no longer fails with new range support
  * but always find combined formats if available
  * suppress ineffective XFF geo-bypass (causes time-outs)
  * adapted from https://github.com/ytdl-org/youtube-dl/pull/29680
  * thx former GH user kikuyan
* [InfoExtractor] Correctly resolve BaseURL in DASH manifest | dirkf | 2024-02-02 | 1 | -2/+19
  Specs:
  * ISO/IEC 23009-1:2012 section 5.6
  * RFC 3986 section 5.
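RFC 3986 section 5 reference resolution is what `urljoin` in the Python standard library implements, so nested `BaseURL` values can be resolved level by level against the manifest URL. The sketch below assumes Python 3 and a hypothetical helper `resolve_base_url`; the example URLs are made up, and this is not the code from the commit.

```python
from urllib.parse import urljoin


def resolve_base_url(mpd_url, base_urls):
    """Resolve a chain of BaseURL values, outermost first.

    Hypothetical helper: each BaseURL is a URI reference resolved against
    the result of the previous level, starting from the manifest URL.
    """
    base = mpd_url
    for ref in base_urls:
        base = urljoin(base, ref)
    return base


mpd_url = 'https://example.com/a/manifest.mpd'
relative = resolve_base_url(mpd_url, ['video/', '../alt/'])
absolute = resolve_base_url(mpd_url, ['https://cdn.example.net/x/', 'seg/'])
```

A relative `BaseURL` inherits the directory of the level above it, while an absolute one replaces the base entirely for everything nested below.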
* [InfoExtractor] Support byte range for DASH | dirkf | 2024-02-02 | 1 | -36/+78
  * adapted from https://github.com/ytdl-org/youtube-dl/pull/30279
  * thx former GH user kikuyan
* [InfoExtractor] Support DASH subtitle extraction (yt-dlp back-port) | dirkf | 2024-02-02 | 1 | -127/+199
* [YouTube] Fix `like_count` extraction using `likeButtonViewModel` | dirkf | 2024-01-22 | 1 | -4/+14
  * also fix various tests
  * TODO: check against yt-dlp tests
* [YouTube] Rework n-sig processing, realigning with yt-dlp | dirkf | 2024-01-22 | 1 | -185/+289
  * apply n-sig before chunked fragments, fixes #32692
* [InfoExtractor] Support some warning and `._downloader` shortcut methods from yt-dlp | dirkf | 2024-01-22 | 1 | -3/+53
* [Epidemic Sound] Add new extractor (#32628) | Robotix | 2023-12-06 | 2 | -0/+102
  * Add simple extractor
  * Support separate tracks
  * Use index as id instead of slug
  ---------
  Co-authored-by: dirkf <fieldhouse@gmx.net>
* [Imgur] Overhaul extractor module (#32612) | dirkf | 2023-12-05 | 1 | -69/+279
  Revise extractors for new API and page formats
* [telewebion] Fix extraction (#32634) | mimvahedi | 2023-12-02 | 1 | -24/+23
  * [telewebion] fix extraction
  Resolves https://github.com/ytdl-org/youtube-dl/issues/5135#issuecomment-932952119
  ---------
  Co-authored-by: dirkf <fieldhouse@gmx.net>
* [Youtube] Update consent cookie handling to match site | ReenigneArcher | 2023-11-29 | 1 | -10/+4
  Apologies for force push! [skip ci]
* [S4C] Add thumbnail extraction, extract series as playlist | dirkf | 2023-08-31 | 2 | -8/+59
  Based on https://github.com/yt-dlp/yt-dlp/pull/7776: thx ifan-t, bashonly
* [S4C] Add extractor for Sianel Pedwar Cymru | dirkf | 2023-08-04 | 2 | -0/+77
  * from https://github.com/yt-dlp/yt-dlp/pull/7730, thx ifan-t, bashonly
* [compat] Use `compat_open()` | dirkf | 2023-07-25 | 2 | -0/+2
* [InfoExtractor] Add `_match_valid_url()` class method and refactor | dirkf | 2023-07-19 | 2 | -19/+38
  * API compatible with yt-dlp
  * also support Sequence of patterns in _VALID_URL
  * one place to compile _VALID_URL
  * TODO: remove existing extractor shims
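A class method that compiles `_VALID_URL` in one place and accepts either a single pattern or a sequence of patterns might look roughly like the following. This is a sketch under assumptions, not the project's code: `SketchIE`, its URL patterns, and the cache attribute `_VALID_URL_RE` are all illustrative (Python 3).

```python
import re


class SketchIE(object):
    # Hypothetical extractor: _VALID_URL may be one pattern or a sequence.
    _VALID_URL = (
        r'https?://example\.com/v/(?P<id>\d+)',
        r'https?://example\.com/clip/(?P<id>\w+)',
    )

    @classmethod
    def _match_valid_url(cls, url):
        # Compile once per class; cls.__dict__ avoids reusing a parent's cache.
        if '_VALID_URL_RE' not in cls.__dict__:
            patterns = cls._VALID_URL
            if isinstance(patterns, str):
                patterns = (patterns,)
            cls._VALID_URL_RE = tuple(re.compile(p) for p in patterns)
        for regex in cls._VALID_URL_RE:
            match = regex.match(url)
            if match:
                return match
        return None


class SingleSketchIE(SketchIE):
    _VALID_URL = r'https?://example\.org/(?P<id>[a-z]+)'


clip = SketchIE._match_valid_url('https://example.com/clip/abc')
single = SingleSketchIE._match_valid_url('https://example.org/xyz')
miss = SketchIE._match_valid_url('https://example.org/xyz')
```

Checking `cls.__dict__` rather than using `hasattr()` matters in this sketch: a subclass with its own `_VALID_URL` must not inherit the compiled patterns of its parent.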
* [InfoExtractor] Add search methods for Next/Nuxt.js from yt-dlp | dirkf | 2023-07-19 | 4 | -53/+62
  * add _search_nextjs_data(), from https://github.com/yt-dlp/yt-dlp/pull/1386 thanks selfisekai
  * add _search_nuxt_data(), from https://github.com/yt-dlp/yt-dlp/pull/1921, thanks Lesmiscore, pukkandan
  * add tests for the above
  * also fix HTML5 type recognition and tests, from https://github.com/yt-dlp/yt-dlp/commit/222a230871fe4fe63f35c49590379c9a77116819, thanks Lesmiscore
  * update extractors in PR using above, fix tests.
* [Clipchamp] Add new extractor back-ported from yt-dlp | dirkf | 2023-07-19 | 2 | -0/+77
* [DLF] Add site extractors back-ported from yt-dlp | dirkf | 2023-07-19 | 2 | -0/+208
  * from https://github.com/yt-dlp/yt-dlp/pull/6697, thanks nick-cd
* [Whyp] Add extractor back-ported from yt-dlp | dirkf | 2023-07-19 | 2 | -0/+79
  * from https://github.com/yt-dlp/yt-dlp/pull/6803, thanks CoryTibbettsDev
* [GlobalPlayer] Add site extractors back-ported from yt-dlp | dirkf | 2023-07-19 | 2 | -4/+296
  * from https://github.com/yt-dlp/yt-dlp/pull/6903, thanks garret1317
* [InfoExtractor] Support groups in `_search_regex()`, etc | dirkf | 2023-07-19 | 1 | -4/+5
* [YouTube] Avoid crash in author extraction | dirkf | 2023-06-22 | 1 | -1/+1
* [YouTube] Improve nsig function name extraction | pukkandan | 2023-06-22 | 1 | -6/+13
  Fixes player b7910ca8, using `,` vs `;`
  See https://github.com/ytdl-org/youtube-dl/issues/32292#issuecomment-1602231170
  Co-authored-by: dirkf
* [YouTube] Improve fix for ae8ba2c | dirkf | 2023-06-18 | 1 | -3/+1
  Thx: https://github.com/yt-dlp/yt-dlp/commit/01aba25
* [YouTube] Fix `KeyError QV` in signature extraction failed | dirkf | 2023-06-17 | 1 | -1/+5
  * temporarily force missing global definition into sig JS
  * improve test: thanks https://github.com/yt-dlp/yt-dlp/issues/7327#issuecomment-1595274615
  * resolves #32314
* [ITV] Fix UA capitalisation in 384f632 | dirkf | 2023-05-23 | 1 | -2/+2
* [YouTube] Support Releases tab | dirkf | 2023-04-23 | 1 | -47/+67
* [YouTube] Simplify signature patterns | dirkf | 2023-04-12 | 1 | -5/+3
* [extractor/youtube] Bypass throttling for `-f17` | pukkandan | 2023-03-19 | 1 | -9/+4
  and related cleanup
  Thanks @AudricV for the finding
  Ref: yt-dlp/yt-dlp/commit/c9abebb
* [extractor/youtube] Construct fragment list lazily | pukkandan | 2023-03-19 | 1 | -6/+12
  Ref: yt-dlp/yt-dlp/commit/e389d17
  See: yt-dlp/yt-dlp#6517
* [AENetworksBaseIE] Report missing show data instead of crash | dirkf | 2023-03-14 | 1 | -5/+18
* [Youtube] Construct dash formats with `range` query | pukkandan | 2023-03-03 | 1 | -6/+16
  See yt-dlp/yt_dlp#6369
* [YouTube] Support @owner format in uploader_id etc | dirkf | 2023-02-24 | 1 | -125/+194
  * implement https://github.com/ytdl-org/youtube-dl/issues/31530#issuecomment-1435734719
  * update affected tests
  * misc clean-ups
* Escape URLs in `sanitized_Request`, not `sanitize_url` | pukkandan | 2023-02-20 | 1 | -0/+19
  d2558234cf5dd12d6896eed5427b7dcdb3ab7b5a added escaping of URLs while sanitizing. However, `sanitize_url` may not always receive an actual URL. Eg: When using `youtube-dl "search query" --default-search ytsearch`, `search query` gets escaped to `search%20query` before being prefixed with `ytsearch:` which is not the intended behavior. So the escaping is moved to `sanitized_Request` instead.
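The ordering problem described in that message can be reproduced with a toy model. Both functions below are hypothetical stand-ins, not youtube-dl's `sanitize_url` or `sanitized_Request`, and the set of "safe" characters is an assumption chosen for the illustration (Python 3).

```python
from urllib.parse import quote


def escape_sketch(text):
    # Hypothetical stand-in for URL escaping: percent-encode unsafe characters.
    return quote(text, safe="%/:=&?~#+!$,;'@()*[]")


def apply_default_search(arg, default_search='ytsearch'):
    # Hypothetical stand-in: input that is not a URL becomes a search pseudo-URL.
    return arg if '://' in arg else '%s:%s' % (default_search, arg)


# Escaping while sanitizing: the query is mangled before the prefix is added.
escaped_early = apply_default_search(escape_sketch('search query'))

# Escaping deferred to request time: the query reaches the search handler intact,
# and only real URLs are escaped when a request is actually built.
escaped_late = apply_default_search('search query')
request_url = escape_sketch('https://example.com/watch?title=a b')
```

With early escaping the search term becomes `ytsearch:search%20query`; with escaping deferred to the request step it stays `ytsearch:search query`, while a real URL is still percent-encoded.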
* [Vimeo] Fix e19ec52 for tween-age Pythons | df | 2023-02-20 | 1 | -1/+1
  * a check in older Pythons in the 2.7 and earlier, 3.3, 3.4 series caused "sre_constants.error: nothing to repeat"
  * satisfy the check by avoiding nested qualifiers that can match empty string
  Resolves #31597
* [YouTube] Avoid crash if uploader_id extraction fails | dirkf | 2023-02-17 | 1 | -1/+3
  See #31530.