Author Topic: PVD Selenium MOD v4 IMDb Movie, People and FilmAffinity Scripts

Offline afrocuban

  • Moderator
  • *****
  • Posts: 623
    • View Profile

IF YOU DON'T READ THIS POST CAREFULLY AND FOLLOW EVERYTHING WRITTEN HERE, BUT JUST DOWNLOAD FILES, I COULD BET IT WILL NOT WORK FOR YOU AND YOU WILL COME BACK HERE ASKING QUESTIONS ALREADY ANSWERED IN THIS POST.


Almost 4 months after hearing the word "Selenium" for the first time ever, and knowing nothing about programming, I am finally releasing what is practically a new PVD MOD, considering the number of files and programs it brings. It consists of the scripts and the program described here.

You now need only one script for IMDb movies, one for IMDb people and one for FilmAffinity movies, and each one does everything: search and download. The Selenium scripts do all the "external" work in the background, so inside PVD the situation stays clean: 2 scripts and a configurator for movies (plus an optional batch file for these 2), and one script and a configurator for people. Check the screenshots below.

I strongly suggest renaming your current "Scripts" folder to, for instance, "Scripts-Original", then putting Scripts.7z in your PVD folder and extracting it there. This creates a "Scripts" folder with all the scripts and files PVD needs to work. As a bonus, I'm contributing the source code of the Scripts Configurator program, as well as an updated and polished UDL file for PVD scripts, which only needs to be imported into Notepad++. If you want to, after testing you can merge the two folders, Selenium and non-Selenium scripts and files, into one "Scripts" folder.

Before that....
As stated here
ensure that:


Quote
A. You installed python
B. You installed selenium via cmd, with
Quote
pip install selenium

C. You installed requests via cmd, with
Quote
pip install requests

D. You have your Chrome bin on the PATH (to test this, open cmd, type "chrome" and check that Chrome opens).
E. You have the Python folder on your PATH. To test this, open cmd, type "python --version" and check that you get the proper feedback, for instance:
Quote
C:\Users\user>python --version
Python 3.12.6
F. pythonw.exe is not missing, and its containing folder is on the PATH. To test this, open cmd, type "pythonw" and check that you get the proper feedback, for instance:
Quote
C:\Users\user>pythonw

C:\Users\user> (empty output)
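
The checks above can also be run in one go. This is only a minimal sketch and not part of the pack: the function names are made up for illustration, and it assumes the commands are named exactly as in the list ("python", "pythonw", "chrome"). Note that the package check looks at the Python interpreter that runs the sketch, which may differ from the one on your PATH.

```python
import importlib.util
import shutil

# Commands the list above expects to be reachable from cmd,
# and the packages it expects to be installed with pip.
REQUIRED_COMMANDS = ("python", "pythonw", "chrome")
REQUIRED_PACKAGES = ("selenium", "requests")


def check_prerequisites():
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    results = {}
    for command in REQUIRED_COMMANDS:
        # shutil.which looks the command up in the directories on PATH.
        results[command] = shutil.which(command) is not None
    for package in REQUIRED_PACKAGES:
        # find_spec locates a package without importing it.
        results[package] = importlib.util.find_spec(package) is not None
    return results


def report(results):
    """Turn the result dict into one readable line per requirement."""
    return [f"{name}: {'OK' if found else 'MISSING'}" for name, found in results.items()]
```

Calling print("\n".join(report(check_prerequisites()))) lists each requirement with OK or MISSING, so you can see at a glance which step of the list still needs attention.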


These scripts:


Quote
1. Use the Chrome browser instead of Firefox
2. Use chromedriver.exe instead of geckodriver
3. Start chromedriver.exe silently
4. Silently invoke the browser in headless mode (no pop-up browser windows)
5. Scrape the .htm pages of the given URLs
6. No path needs to be set manually inside the script: it is set relative to the path of the Selenium script!
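
For the curious, points 1 to 6 can be sketched roughly as below. This is not the code from the pack, only a minimal illustration with the standard Selenium 4 API; the function names are hypothetical, and the assumption is that chromedriver.exe sits in the same folder as the script.

```python
from pathlib import Path


def chromedriver_path(script_file):
    """Locate chromedriver.exe in the same folder as the given script file."""
    return Path(script_file).parent / "chromedriver.exe"


def fetch_page_source(url, script_file):
    """Open url in headless Chrome and return the rendered HTML as a string."""
    # Imported here so the path helper above works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument("--headless=new")  # no visible browser window
    service = Service(executable_path=str(chromedriver_path(script_file)))
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always shut down Chrome and chromedriver
```

Inside a script you would call fetch_page_source(some_url, __file__) and write the returned HTML to an .htm file; because the driver path is derived from the script's own location, moving the whole "Scripts" folder does not break anything.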


You just use your PVD as ever, just be sure to extract as instructed above.

For using relative path, ensure:

Quote
6B. You put the appropriate chromedriver.exe in the "Scripts" folder, too. There is no installation for chromedriver, just extract it from the .zip file into your "Scripts" folder described above. IMPORTANT!!!! You need to download the chromedriver.exe of the same version as your Chrome browser. At the moment of this post, the stable version is v134. You can find the Chrome browser download and the appropriate chromedriver here. For example, for v134, the Stable links are:
Chrome browser:

1. chrome   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chrome-win32.zip
2. chrome   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chrome-win64.zip
Chromedriver:

1. chromedriver   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chromedriver-win32.zip
2. chromedriver   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chromedriver-win64.zip
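
The links above all follow one pattern, so for another version you can build them yourself. A small sketch, assuming the pattern stays as in the four links listed here; the function names are only for illustration, and comparing just the first number of the version is my simplification of "the same version".

```python
BASE_URL = "https://storage.googleapis.com/chrome-for-testing-public"


def download_url(version, platform, product):
    """Build a download link; product is "chrome" or "chromedriver"."""
    return f"{BASE_URL}/{version}/{platform}/{product}-{platform}.zip"


def major_version(version):
    """Return the first number of a version string such as 134.0.6998.88."""
    return int(version.split(".")[0])


def versions_match(chrome_version, driver_version):
    """Check that Chrome and chromedriver share the same major version."""
    return major_version(chrome_version) == major_version(driver_version)
```

For example, download_url("134.0.6998.88", "win64", "chromedriver") gives exactly the win64 chromedriver link listed above.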

From this point on, everything is automated and headless, silent as never before.

The amount of data imported is huge! I have included dozens of new custom fields. IF YOU ARE AS DATA HUNGRY AS I AM, THE MOST DATA CAN BE COLLECTED IF YOU CHECK AND UNCHECK OPTIONS AS IN THE SCRIPTS CONFIGURATOR SCREENSHOTS BELOW. The updated table with all the fields in these 3 scripts can be found here. To see them all, you have to use the Classic skin, or add the custom fields to your PVD and create your own custom skin from there, or use one of my skins from here, once I complete them and adjust them for the final Selenium v4 scripts.


Examine the table. That is the only way you will learn which fields come from which movie/person page and whether you want them or not. The less you want, the faster PVD will be.

Please feel free to test the scripts and give me feedback if something doesn't work. When I say "it doesn't work", I mean that everything works, and the only issues that can happen are that the sites have changed their html layout so that not all fields are available again, or that your Chrome updated automatically to a higher version and you didn't download and extract the corresponding chromedriver version. The best indicator of the latter is that no .log file is created in the "\Scripts\Tmp\" folder. I will update whatever you report within a month at most from the first report, to give us all time to collect and report as many issues as possible.
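
The .log check can be done by hand in Explorer, or with a few lines like these. It is only a sketch: the function names are hypothetical, and it assumes nothing more than what is said above, that a working run leaves a .log file in "\Scripts\Tmp\".

```python
from pathlib import Path


def find_logs(scripts_dir):
    """Return the .log files found in the Tmp folder under scripts_dir."""
    tmp_dir = Path(scripts_dir) / "Tmp"
    if not tmp_dir.is_dir():
        return []
    return sorted(tmp_dir.glob("*.log"))


def diagnose(scripts_dir):
    """Give a one-line hint based on whether any .log file exists."""
    if find_logs(scripts_dir):
        return "Log found: the Selenium script started."
    return "No log: check that chromedriver matches your Chrome version."
```

Run diagnose on the full path of your "Scripts" folder right after a failed download; if it reports no log, look at the chromedriver version first.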

What I have learned


On this long journey, I learned how hard coding is. I also had to learn quite a lot about Pascal/Delphi, about Python and, most frustratingly, AHK! I had to revise all the scripts from scratch several times, either because of the concepts I was developing along the way, or because IMDb and FilmAffinity were changing their layout. For example, just yesterday a new Chrome version came out and my chromedriver didn't work anymore, so I had to download a new version of it too. Also, just 2 days ago I learned that FilmAffinity introduced their AKA for some movies, so I had to update the FA script again. And so on and on for 4 months. Thus, I learned to appreciate this work. Most importantly, I now appreciate even more EasyVVV's, and especially Ivek's, work over more than a decade (!!!) to keep PVD alive for us!

So, humbly, I dedicate this hard work to EasyVVV, but most of all, and before anyone else, I dedicate it to IVEK and to the memory of his late mother! I HOPE IVEK CAN IMAGINE HOW GRATEFUL TO HIM I PERSONALLY AM, AFTER I REALIZED HOW HARD THIS ALL WAS! GOD BLESS YOU IVEK!
« Last Edit: March 18, 2025, 12:39:52 am by afrocuban »

Offline afrocuban

New Selenium Scripts Configurator for PVD Selenium MOD v4
« Reply #1 on: March 13, 2025, 11:58:59 pm »
As I wrote, I rewrote the Scripts Configurator practically from scratch in order to bring in all the options that automate the scripts' behaviour. It is now resizable and has scrollbars, so we can put as many options from the scripts there as we wish.

Also, the most data can be imported if you check and uncheck options as in the screenshots below. Feel free to test though. Don't be afraid! You can't mess up your database, because backup files are created anyway!


READ THE MESSAGES UPON CLICKING "SAVE" BUTTON IN CONFIGURATOR. THOSE MESSAGES ARE VERY INFORMATIVE, HELPFUL AND ESSENTIAL TO UNDERSTAND WHAT IS GOING ON.
« Last Edit: March 14, 2025, 12:09:46 am by afrocuban »

Offline afrocuban

Re: PVD Selenium MOD v4
« Reply #2 on: March 14, 2025, 12:03:02 am »
Due to the post limit, in this message the Scripts Configurator window is shown shrunk, as a proof of concept, with horizontal and vertical scrollbars. I had no idea how huge a challenge it would be to add scrollbars. Such a feature is pretty hard to get in AHK, and I lost at least 2 weeks getting it right, so maybe in the future I will build this with Python. I already tested it.

Offline afrocuban

IMDb ALL-IN-ONE SCRIPT
« Reply #3 on: March 23, 2025, 01:55:13 am »
IMPORTANT!!!

A few hours ago, IMDb completely changed the html layout of the /fullcredits page, so that page doesn't work any more. Soon, the /reference page will be changed too. I know because I got popups offering me a peek at the new "Reference" page. So, until that happens, I will not update the scripts, because both pages will then share the same code again, and it will be easier to change them together. For now I made a quick fix so that everything works if you check the options in the Configurator as I suggested earlier. In addition, you have to check the "Download the Cast or Credit (text only) provider page to retrieve the full information. Or else, only the info from the main movie page will be downloaded." option and download the fullcredits page too!!! This should work until the /reference page changes, or any other page changes in the meantime.

And it happened just when I had finished the "all-in-one script" and was successfully running the final tests. Here's the pack.


Quote
So, with one IMDb script you get all movies, series and episode lists, and then you apply the same script to the episodes.


Also, a new search window is introduced, with different types of search and a 10-second countdown that defaults to the "general" search.

It took only 600 additional lines compared to the Movie script, including a lot of commented-out lines, and one simple Python script to get all of this.

Extract and overwrite existing scripts with this pack.


I will soon start to revive the AllMovie and RottenTomatoes scripts. I will not revive any other script.
« Last Edit: March 23, 2025, 02:47:15 am by afrocuban »

 
