Recent Posts

Pages: 1 [2] 3 4 5 6 7 ... 10
11
This will most probably work, but I think it is fragile too... I am sure it will soon stop being possible again, but maybe something new will come up...



Quote
This will most probably work, but I think it is fragile too... I am sure it will soon stop being possible again, but maybe something new will come up...

Yes — what you currently have is a classic Selenium workflow:
It spins up a new ChromeDriver instance for each URL.
It applies stealth tweaks, sets cookies, navigates, clicks “See more” buttons, saves HTML, and then quits.
You’re running this in parallel threads (ThreadPoolExecutor(max_workers=4)), so four fresh Chrome sessions at a time.
That’s exactly the pattern IMDb’s AWS WAF is now blocking: fresh, headless, parallel, automation‑fingerprinted sessions.

Why this fails against IMDb now
Headless mode is fingerprinted. Even with stealth, AWS WAF detects it.
Fresh profiles per run (no browsing history, no persistent cookies) scream “bot.”
Parallel sessions look like automation, not human browsing.
Adding fake cookies doesn’t help — WAF requires valid tokens generated by a real browser session.
How to adapt this code to “attach to Chrome”
Instead of creating a new webdriver.Chrome(...) each time, you’d:

Start Chrome manually with debugging enabled

chrome.exe --remote-debugging-port=9222 --user-data-dir="C:\ChromeIMDb"
This opens Chrome with a persistent profile (C:\ChromeIMDb).
You log in once, build up cookies/history naturally.
Change your Selenium init code
Replace:

service = Service(chrome_path)
chrome_options = build_chrome_options(headed=False)
driver = webdriver.Chrome(service=service, options=chrome_options)
With:

service = Service(chrome_path)
chrome_options = webdriver.ChromeOptions()
chrome_options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(service=service, options=chrome_options)
Now Selenium attaches to the Chrome you already opened.
It inherits your cookies, extensions, and fingerprint.
Run serially, not in parallel
IMDb WAF is sensitive to multiple simultaneous sessions.
Change ThreadPoolExecutor(max_workers=4) → max_workers=1.
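
The attach steps above can be sketched roughly as follows. This is only a minimal sketch, assuming Chrome was already started by hand with --remote-debugging-port=9222. The helper names debug_port_open and attach_to_chrome are made up for illustration and are not part of the existing scripts. Selenium is imported inside the function so that the port check can be used on its own.

```python
import socket


def debug_port_open(host="127.0.0.1", port=9222, timeout=1.0):
    """Return True if something is listening on the Chrome debugging port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def attach_to_chrome(chromedriver_path, host="127.0.0.1", port=9222):
    """Attach Selenium to an already running Chrome (hypothetical helper)."""
    # Imported here so the port check above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    if not debug_port_open(host, port):
        raise RuntimeError(
            f"No Chrome found on {host}:{port}; start it with "
            f"--remote-debugging-port={port} first."
        )
    options = webdriver.ChromeOptions()
    options.add_experimental_option("debuggerAddress", f"{host}:{port}")
    return webdriver.Chrome(service=Service(chromedriver_path), options=options)
```

Checking the port first gives a clear error message when Chrome was not started with the debugging flag, instead of a Selenium timeout.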

Important adjustments
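Don’t call driver.quit() after each URL. With an attached browser it normally ends only the ChromeDriver session rather than Chrome itself, but you would have to re-attach for every page. Reuse the same driver instead, and be careful with driver.close(): it closes the current tab, and closing the last tab closes the window.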
Remove fake cookie injection — you don’t need it if you’re using your real Chrome profile.
Headed mode only — you’ll see the browser window, but that’s what passes WAF.
In short: your current script is fine for FilmAffinity, but IMDb now requires either:

Attach to Chrome (reuse your real session), or
Switch to IMDb datasets / APIs for long‑term stability.
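
The advice to run serially and reuse one driver could look something like this. It is a sketch under assumptions, not the code from the scripts: fetch_serially is a hypothetical name, and the driver argument only needs a get(url) method and a page_source attribute, as a Selenium WebDriver has. The 8-12 second pause is taken from the time.sleep line posted further down this page and is not a tested value.

```python
import random
import time


def fetch_serially(driver, urls, min_pause=8.0, max_pause=12.0, sleep=time.sleep):
    """Load each URL in turn on one driver and return {url: html}.

    The driver is reused for every URL and never quit here, so the
    attached Chrome session stays available for the next page.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Random pause between page loads (assumed range, see lead-in).
            sleep(random.uniform(min_pause, max_pause))
        driver.get(url)
        pages[url] = driver.page_source
    return pages
```

Passing sleep as a parameter makes it easy to try the loop with a fake driver before pointing it at a real browser.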
12
In case other people run into this, here is how I fixed it in Selenium_Chrome_Movie_Additional_pages_v4:

after driver.get(download_url)

I added:

time.sleep(random.uniform(8, 12))

This change does not work because it blocks the download of Additional pages.
13
Hello,

Thank you for your epic work on keeping the scripts and PVD alive.


After working very well for 3-4 days, today (22.01.2026) the downloads of the keywords and reviews pages keep returning this:

Code: [Select]
<html lang="en"><head>
           "context":"
};
    </script>
    <script src="https://1c5c1ecf7303.8b78215a.eu-north-1.token.awswaf.com/1c5c1ecf7303/e231f0619a5e/0319a8d4ae69/challenge.js"></script>
</head>
<body>
    <div id="challenge-container"></div>
    <script type="text/javascript">
        AwsWafIntegration.saveReferrer();
        AwsWafIntegration.checkForceRefresh().then((forceRefresh) => {
            if (forceRefresh) {
                AwsWafIntegration.forceRefreshToken().then(() => {
                    window.location.reload(true);
                });
            } else {
                AwsWafIntegration.getToken().then(() => {
                    window.location.reload(true);
                });
            }
        });
    </script>
    <noscript>
        <h1>JavaScript is disabled</h1>
        In order to continue, we need to verify that you're not a robot.
        This requires JavaScript. Enable JavaScript and then reload the page.
    </noscript>

</body></html>

After some searching, I got this from ChatGPT:

What the error actually is: the file you’re saving is not the keywords page. It’s an AWS WAF (Web Application Firewall) challenge page returned by IMDb.

Key signs from the HTML:

challenge.js
AwsWafIntegration
“verify that you're not a robot”
JavaScript-based token refresh

This means: IMDb detected automation and served a bot-check page instead of the real content.
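
The key signs listed above can be turned into a simple check on the saved file. This is a heuristic sketch only: the marker strings are copied from the HTML posted above, while the function name looks_like_waf_challenge and the rule "at least two markers" are my own assumptions.

```python
# Strings that appear in the challenge page shown above.
WAF_MARKERS = ("awswaf.com", "AwsWafIntegration", "challenge-container")


def looks_like_waf_challenge(html, min_hits=2):
    """Return True if the HTML contains at least `min_hits` WAF markers."""
    hits = sum(1 for marker in WAF_MARKERS if marker in html)
    return hits >= min_hits
```

A script could call this right before saving a page and skip or retry the download instead of writing the challenge page to disk.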

In case other people run into this, here is how I fixed it in Selenium_Chrome_Movie_Additional_pages_v4:

after driver.get(download_url)

I added:

time.sleep(random.uniform(8, 12))
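
As a possible alternative to a fixed sleep, the script could poll until the challenge page has reloaded itself, since the HTML above calls window.location.reload once it has a token. This is an untested sketch, not something from the scripts: wait_until is a hypothetical helper, and whether the WAF lets the reloaded page through is not guaranteed.

```python
import time


def wait_until(condition, timeout=30.0, poll=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `condition()` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met, False if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(poll)
```

With Selenium the condition could be something like lambda: "challenge-container" not in driver.page_source, called after driver.get(download_url).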

14
144.0.7559.31 not 144.0.7559.59
Also, you don't need external sites. Nothing is parsed from external sites so far. They were put there for possible future use.
15
Finally, here are all the scripts.


Never forget to read the first message in the topic. All the answers and solutions needed for the scripts and PVD to work flawlessly are there.


As usual, backup and empty Scripts folder and extract Scripts_2026-01-06.7z there. Extract the other file into PVD root folder.

If you want to use the scripts with my skin, you can download it with the list of custom fields here:

https://www.videodb.info/forum_en/index.php/topic,4388.msg23025.html#msg23025

Important note: Since I didn't see even a "thanks" or any kind of feedback (except from Ivek, and I haven't seen him recently either) in more than a year of hard work, I guess there is no interest in these, so I will not update the scripts anymore. But anyway, the given files are a firm base for someone else to take over and continue where I left off. If I could do it with AI, anyone can.


Best regards.
Thank you for your support of PVD. But I'm having trouble working with Selenium. I updated ChromeDriver (144.0.7559.59) and Python (3.14.2), and I still can't get information from IMDb. The log file keeps showing no connection.
16
Thanks Ivek! Wishing you good health!

Thanks.
17
Thanks Ivek! Wishing you good health!
18
It is important to note that you must have the latest version of chromedriver.exe for this to work. Chromedriver always needs to be updated to the latest version; this is a prerequisite for all scripts to work.

Somehow, despite my health problems, I managed to check how the latest IMDb Movie, Allmovie and Rottentomatoes v4 scripts work. They work fine (with some cosmetic errors when transferring information), provided that you use the latest chromedriver.exe, as I mentioned above.

I didn't check the other scripts.
19
It is important to note that you must have the latest version of chromedriver.exe for this to work. Chromedriver always needs to be updated to the latest version; this is a prerequisite for all scripts to work.
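
A quick way to compare the installed Chrome and chromedriver.exe versions is sketched below. The helper names are made up for illustration, and how many parts of the version have to match is an assumption (the default compares the first three parts); the version strings are the ones mentioned elsewhere on this page.

```python
def version_tuple(version):
    """Turn a string like '144.0.7559.59' into (144, 0, 7559, 59)."""
    return tuple(int(part) for part in version.strip().split("."))


def versions_match(chrome_version, driver_version, parts=3):
    """Compare the first `parts` components of two version strings."""
    chrome = version_tuple(chrome_version)[:parts]
    driver = version_tuple(driver_version)[:parts]
    return chrome == driver
```

For example, versions_match("144.0.7559.31", "144.0.7559.59") compares only 144.0.7559, while parts=4 requires the full version to be identical.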
20
Thank you for all the effort put into creating all the scripts. As for me, I currently have quite a few health problems, so I am less present on the forum, and because of this I am using PVD and testing scripts very little.

My wish is that you would still help update all the scripts.

As for other users, I assume that some people find it difficult to install or use Python on their computer, because they may be less experienced with such programs.

To clarify, I do not know much about programming myself, because I am self-taught and have never taken courses on Windows or programming.