Messages - afrocuban

1
PVD Python Scripts / IMDb ALL-IN-ONE SCRIPT
« on: March 23, 2025, 01:55:13 am »
IMPORTANT!!!

A few hours ago, IMDb completely changed the HTML layout of the /fullcredits page, so that page no longer works. The /reference page will change soon too; I know because I got popups offering me a peek at the new "Reference" page. Until that happens I will not update the scripts, because both pages will then share the same code again and it will be easier to change them together. For now I made a quick fix so that everything works if you check the options in Configurator as I suggested earlier. In addition, you have to check the option "Download the Cast or Credit (text only) provider page to retrieve the full information. Or else, only the info from the main movie page will be downloaded." and download the fullcredits page too!!! This should work until the /reference page changes, or until any other page changes in the meantime.

And it happened just as I finished the "all-in-one script", while I was successfully running the final tests. Here's the pack.


Quote
So, with one IMDb Script you get all movies, Series, episode list, and then you apply the same script for episodes.


Also, a new search window is introduced, with different types of search and a 10-second countdown that defaults to the "general" search.

It took only 600 additional lines compared to the Movie script, including a lot of commented-out lines, and one simple Python script to get all of this.

Extract and overwrite existing scripts with this pack.


I will soon start reviving the AllMovie and RottenTomatoes scripts. I will not revive any other script.

2
Thanks!

3
Thanks Ivek.


Here's the assessment of these 3 snippets:


The only difference I see in the first 2 snippets is that the commented-out lines are deleted from the original. I compared everything in Notepad++:

Quote
      //MovieURL := 'http://www.imdb.com' + TextBetWeenFirst(ItemList, '", "url":"', '", "name":"');
Quote
      //LogMessage('Function ParsePage_IMDBMovieBASE -   *   Get result url 1: ' + MovieURL + ' | |');

Quote
         // If titleValue = '' then titleValue := TextBetWeenFirst(ItemList, '<h1 class="long">', '<'); // Strings which opens/closes the data. WEB_SPECIFIC



These lines, which were already commented out, are just deleted as an "improvement", but they don't influence anything anyway, so I am not sure what the improvement is. With or without them, everything works.

Also, the third snippet works too; it's just another way to achieve the same thing. There are many ways to reach the same goal, and I chose one that works too.

I hope to see what actually doesn't work in the scripts.



4
Question:
Which script creates seasons in PVD?

5
For my personal use, I will disable all the duplicates of the custom fields once I have set up all the scripts. We were foolish to ask for, and be allowed, something like 6 custom fields for aspect ratio, for example. I just brought them all so users can decide which ones they want. If some are empty, either they don't exist for the given movie or the Configurator is set not to bring them. The logic of the script is extremely complex. But let's see.

6
While we are waiting for Ivek to decide whether I will continue to share my script, I have integrated the Series script into the Movie script and upgraded the search with a countdown for the default "general" search. In the screenshot you can see the search window options, as well as series imported with the movie script.

7
Thanks, Ivek. As usual, if you could be more specific, it would be helpful.

8
Thanks for the explanation. I'm looking for ways to automate this without a limit on the number of seasons and episodes...

9
Thanks, Ivek. I already looked at those scripts, of course, but what I don't get is how PVD knows when the first season is finished. When does Function ParsePage_IMDBMovieSEASON1 stop with AddEpisode, and Function ParsePage_IMDBMovieSEASON2 start?

10
I must have made a mistake, I just didn't notice it (I still have a lot of things to sort out for my mother's passing, so some details are missing and I don't notice them). I'll fix that and see if it works.

I fixed it now and it works perfectly.

Great! Enjoy it!

May I ask you a question? Can you describe the flow of adding series, seasons and episode links? Which script does which task? I'm almost done integrating the series script into the movie script, but at the moment I am stuck on what generates the seasons, what generates the episodes in each season, what provides the links to the episodes and so on. Thank you in advance.

11
PVD Python Scripts / Re: Firefox Selenimu Script Discussion
« on: March 15, 2025, 10:57:33 pm »

Interesting, I also did a test with Chromedriver (without using the Chrome browser), and it works great there.


I am so happy to hear it works!

Here are the scripts adapted for Firefox, so that you can see my adaptation and maybe spot errors, or whether I did something wrong or missed a change.


The script looks mostly fine, but there is an issue in the following lines:

Quote
firefox_options_options.add_argument("--headless")  # Running Firefox in headless mode
firefox_options_options.add_argument(f"--lang={language_code}")
firefox_options_options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")
firefox_options_options.add_argument("--enable-unsafe-swiftshader") # Add this flag to use unsafe SwiftShader

Here, firefox_options_options is used, but the variable should be firefox_options. The correct lines should be:

Quote
firefox_options.add_argument("--headless")  # Running Firefox in headless mode
firefox_options.add_argument(f"--lang={language_code}")
firefox_options.add_argument("--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")
firefox_options.add_argument("--enable-unsafe-swiftshader") # Add this flag to use unsafe SwiftShader

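Once this change is made, the script should work as expected. Other than that, the syntax looks correct! Just make sure you have the necessary dependencies installed and the correct path to geckodriver. One caveat: --user-agent and --enable-unsafe-swiftshader are Chrome command-line switches, so I would not expect them to have any effect in Firefox; there the user agent is normally set with firefox_options.set_preference("general.useragent.override", ...).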

12
PVD Python Scripts / Re: Firefox Selenimu Script Discussion
« on: March 15, 2025, 07:37:21 pm »

No, I didn't rename Selenium_Chrome_Base_page_v4.py or the other Python files, because it would be too time-consuming to do it everywhere in the .psf files. The only thing I changed in all the Python files was the settings for Firefox and geckodriver.
When the driver is called in the Python script, is it called as chrome or as gecko? It would be good to share your script so we can see how you adjusted it.

Quote

That's all that was written in python_script_base_page.txt.
Quote
2025-03-15 10:08:08,399 - DEBUG - Starting the Python script.
2025-03-15 10:08:08,403 - DEBUG - Starting new HTTP connection (1): ipinfo.io:80
2025-03-15 10:08:08,641 - DEBUG - http://ipinfo.io:80 "GET /country HTTP/1.1" 200 27
2025-03-15 10:08:08,642 - DEBUG - Country code: SI, Language code: sl


This is why I suspect the driver isn't called at all... Try to test the script from cmd and you will get a more informative response. For the title search:

Quote
python FullPathToTheScript titleIMDb "10 Things...." (with double quotes or single quotes, depending on your setup; try them both)

for the main page:

Quote
python FullPathToTheScript "MovieURL" "FullPathToThe\downpage-UTF8_NO_BOM.htm" (with double quotes or single quotes, depending on your setup; try them both)
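
The same manual test can also be driven from Python, which sidesteps the single-versus-double quote question because every argument is passed as its own list item. This is only a minimal sketch using the standard library; run_scraper and report are hypothetical helper names, and the script path and arguments are placeholders for your own values.

```python
import subprocess
import sys


def run_scraper(script_path, *args, timeout=120):
    """Run a Python script with the given arguments and capture its output.

    Hypothetical helper: script_path and args stand in for the Selenium
    script and the arguments used in the cmd test.
    """
    completed = subprocess.run(
        [sys.executable, script_path, *args],  # no shell, so no quoting issues
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout, completed.stderr


def report(script_path, *args):
    """Print everything the script wrote, including any traceback."""
    code, out, err = run_scraper(script_path, *args)
    print(f"exit code: {code}")
    print(f"stdout:\n{out}")
    print(f"stderr:\n{err}")
    return code
```

For example, report(r"FullPathToTheScript", "titleIMDb", "10 Things....") prints the exit code together with stdout and stderr, so a Python traceback from a driver that failed to start shows up directly.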

13
PVD Python Scripts / Re: Firefox Selenimu Script Discussion
« on: March 15, 2025, 08:31:48 am »
Ok, that's better. Let's move the debugging to another topic; you may call it Firefox Selenium? Several ideas:
1. Did you rename any files? In this case, did you rename Selenium_Chrome_Base_page_v4.py? If so, rename it everywhere in the .psf too.
2. What does the corresponding base.log file in the \Tmp folder say?

14
I am sorry to hear that. If you could be a bit more specific maybe I'd get an idea what it might be.

In the meantime, I have started upgrading the Selenium Chrome search script to be "one for all". Now you can choose between different title types as I grouped them; plus, when importing, for example, series and movies at the same time with "Tools->Scan folders...", you can now use the "general" search. I will try to merge the IMDb Movie and Series scripts, and hopefully Episodes at the end. At first glance there aren't many differences between them.

15
PVD Python Scripts / Re: Firefox Selenimu Script Discussion
« on: March 14, 2025, 12:04:39 pm »
Thanks for the comprehensive explanation for Firefox browsers.

Thank you very much.

You are more than welcome, Ivek. I never tried it, so I am not sure at all how Firefox would download pages (clicking "See more" pages, "Storyline" sections and others), or whether the final HTML would be the same as when downloaded with Chrome, so it might be frustrating to find that the HTML scraped with each browser actually differs.

P.S. In the people script, I brought the career option back to the base function too, so just make sure the proper switch (ShouldParseCareer) is set not to parse it with the bio function.

16
PVD Python Scripts / Re: Firefox Selenimu Script Discussion
« on: March 14, 2025, 11:34:39 am »
Question

How do I fix this for those of us who use Firefox browser instead of Chrome browser?



To adapt your existing Chrome Selenium script to use Firefox with geckodriver, you'll need to make a few key changes. Here's a detailed explanation of the necessary modifications:
 1. Change WebDriver to Firefox: Selenium has a Firefox WebDriver (webdriver.Firefox()), similar to how you're using the Chrome WebDriver (webdriver.Chrome()).
 2. Install and Use Geckodriver: You need to install geckodriver (the WebDriver for Firefox) and either make sure it is available in your system’s PATH or specify its path explicitly.
 3. Modify Chrome-Specific Options to Firefox-Specific Options: Some Chrome-specific options (like chrome_options) need to be replaced with their Firefox counterparts. The FirefoxOptions object is used to set browser-specific configurations.
 4. Remove Chrome-specific arguments and replace them with Firefox-specific ones: For Firefox, you would use FirefoxOptions and its methods instead of ChromeOptions.
 

Step-by-Step Adaptation
  • Install Firefox and Geckodriver
    • Firefox: If you don’t have Firefox installed already, you can install it from the official website.
    • Geckodriver: You can download it from the Geckodriver GitHub releases page. Make sure to download the version that matches your operating system, and either place it in a directory that's included in your PATH or specify the path in the script.
  • Modify Imports: You need to import the Firefox-specific classes instead of the Chrome ones.
    Quote
    from selenium import webdriver
    from selenium.webdriver.firefox.service import Service as FirefoxService
    from selenium.webdriver.firefox.options import Options as FirefoxOptions
    from selenium.webdriver.common.by import By
  • Set Firefox Options: Replace chrome_options with firefox_options, and replace the Chrome-specific arguments with their Firefox equivalents.
    Quote
    # Set Firefox options
    firefox_options = FirefoxOptions()
    firefox_options.add_argument("--headless")  # Running Firefox in headless mode
    firefox_options.add_argument(f"--lang={language_code}")  # Set language
    firefox_options.set_preference("general.useragent.override", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Firefox/91.0 Safari/537.36")  # User agent
    Note that Firefox handles some settings differently from Chrome. For example, the user agent is set through the general.useragent.override preference rather than through a command-line argument.
  • Replace Chrome WebDriver with Firefox WebDriver: When initializing the WebDriver, use webdriver.Firefox instead of webdriver.Chrome.
    Quote
    # Path to geckodriver
    geckodriver_path = os.path.join(current_dir, "geckodriver.exe")
    # Ensure geckodriver exists
    if not os.path.exists(geckodriver_path):
        logging.error(f"Geckodriver not found at path: {geckodriver_path}")
        sys.exit(f"Geckodriver not found at path: {geckodriver_path}")
    # Initialize FirefoxDriver
    service = FirefoxService(executable_path=geckodriver_path)
    driver = webdriver.Firefox(service=service, options=firefox_options)
    logging.info(f"Geckodriver started from: {geckodriver_path}")
  • Make Other Necessary Changes for Firefox: You don't need to make major changes for general functionality, but if you're handling Firefox-specific settings (like cookies or scrolling behavior), you may need to adjust those based on Firefox’s behavior.


Full Example of Key Changes:
Here’s an example of how you can modify the initialization and WebDriver setup for Firefox:
Quote
# Import the necessary components for Firefox
from selenium import webdriver
from selenium.webdriver.firefox.service import Service as FirefoxService
from selenium.webdriver.firefox.options import Options as FirefoxOptions
from selenium.webdriver.common.by import By
import os
import logging
import sys

# language_code and current_dir are assumed to be defined earlier in your existing script

# Set Firefox options
firefox_options = FirefoxOptions()
firefox_options.add_argument("--headless")  # Running Firefox in headless mode
firefox_options.add_argument(f"--lang={language_code}")  # Set language
firefox_options.set_preference("general.useragent.override", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Firefox/91.0 Safari/537.36")  # User agent

# Path to geckodriver
geckodriver_path = os.path.join(current_dir, "geckodriver.exe")

# Ensure geckodriver exists
if not os.path.exists(geckodriver_path):
    logging.error(f"Geckodriver not found at path: {geckodriver_path}")
    sys.exit(f"Geckodriver not found at path: {geckodriver_path}")

# Initialize FirefoxDriver
service = FirefoxService(executable_path=geckodriver_path)
driver = webdriver.Firefox(service=service, options=firefox_options)
logging.info(f"Geckodriver started from: {geckodriver_path}")

Summary of Changes:
  • Import the FirefoxService and FirefoxOptions.
  • Set Firefox-specific options (firefox_options), such as headless mode and language settings.
  • Initialize the webdriver.Firefox instead of webdriver.Chrome with the appropriate service and options.
  • Ensure the path to geckodriver is correct.


That should be all you need to switch from Chrome to Firefox using Selenium and Geckodriver! The rest of your script (e.g., interacting with elements, taking screenshots, saving the page) should work with minimal modification because Selenium provides a consistent API for interacting with different browsers.
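
To illustrate the point about Selenium's consistent API, here is a rough sketch of one function that builds either a Chrome or a Firefox driver, plus a helper that saves the rendered page. It assumes Selenium 4 is installed; make_driver, save_page and their parameters are hypothetical names for illustration and are not taken from the shared scripts.

```python
import os


def make_driver(browser, driver_path, headless=True):
    """Return a Selenium WebDriver for "chrome" or "firefox".

    Hypothetical helper, assuming Selenium 4. driver_path is the full path
    to chromedriver.exe or geckodriver.exe.
    """
    browser = browser.lower()
    if browser not in ("chrome", "firefox"):
        raise ValueError(f"Unsupported browser: {browser}")
    if not os.path.exists(driver_path):
        raise FileNotFoundError(f"Driver not found at path: {driver_path}")

    # Imported here so the argument checks above work without Selenium.
    from selenium import webdriver

    if browser == "chrome":
        from selenium.webdriver.chrome.options import Options
        from selenium.webdriver.chrome.service import Service
        options = Options()
        if headless:
            options.add_argument("--headless")
        service = Service(executable_path=driver_path)
        return webdriver.Chrome(service=service, options=options)

    from selenium.webdriver.firefox.options import Options
    from selenium.webdriver.firefox.service import Service
    options = Options()
    if headless:
        options.add_argument("--headless")
    service = Service(executable_path=driver_path)
    return webdriver.Firefox(service=service, options=options)


def save_page(driver, url, output_path):
    """Load url and write the rendered HTML as UTF-8 without a BOM."""
    driver.get(url)
    with open(output_path, "w", encoding="utf-8") as handle:
        handle.write(driver.page_source)
```

Everything after make_driver returns (driver.get, driver.page_source, driver.quit) is the same call for both browsers, which is why only the setup part of a script has to change.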

17
Support / Re: List of custom fields
« on: March 14, 2025, 12:33:54 am »
Here you can find the updated list of all custom and original fields, with an additional sheet for my Selenium scripts, which can be found here.


Now we know which script, and which function/internet page inside each script, brings which fields to PVD, so you can work out your own strategy for which pages to download and which to skip.

18
PVD Python Scripts / Re: PVD Selenium MOD v4
« on: March 14, 2025, 12:03:02 am »
Due to the post limit, this message shows the Scripts Configurator window shrunk, as a proof of concept, with horizontal and vertical scrollbars. I had no idea how big a challenge it would be to add scrollbars. This kind of feature is pretty hard to get in AHK and I lost at least 2 weeks getting it right, so maybe in the future I will build this with Python. I already tested it.

19
As I wrote, I rewrote the Scripts Configurator practically from scratch in order to bring in all the options for automating the scripts' behaviour. It is now resizable and has scrollbars, so we can put as many options from the scripts there as we wish.

Also, most of the data can be imported if you check and uncheck the options as in the screenshots below. Feel free to test, though. Don't be afraid! You can't mess up your database, because backup files are created anyway!


READ THE MESSAGES UPON CLICKING "SAVE" BUTTON IN CONFIGURATOR. THOSE MESSAGES ARE VERY INFORMATIVE, HELPFUL AND ESSENTIAL TO UNDERSTAND WHAT IS GOING ON.

20

IF YOU DON'T READ THIS POST CAREFULLY AND FOLLOW EVERYTHING WRITTEN HERE, BUT JUST DOWNLOAD FILES, I COULD BET IT WILL NOT WORK FOR YOU AND YOU WILL COME BACK HERE ASKING QUESTIONS ALREADY ANSWERED IN THIS POST.


Almost 4 months after I first heard the word "Selenium", knowing nothing about programming, I am finally bringing what is practically a new PVD MOD, considering the number of files and programs involved. It consists of the scripts and the program as described here.

You now need only one script for IMDb movies, one for IMDb people and one for FilmAffinity movies for everything: search and download. The Selenium scripts do all the "external" work in the background, so in your PVD you have a clean setup: 2 scripts and a configurator for movies (plus a .batch file for these 2 if you wish), and one script and a configurator for people. Check the screenshots below.

I strongly suggest renaming your current "Scripts" folder to, for instance, "Scripts-Original", then putting this Scripts.7z in your PVD folder and extracting it there. It will create a "Scripts" folder with all the scripts and files PVD needs to work (as a bonus, I'm contributing the source code for the Scripts Configurator program, as well as an updated and polished UDL file for PVD scripts that just needs to be imported into Notepad++). If you want to, after testing you can merge the two folders, Selenium and non-Selenium scripts and files, into the "Scripts" folder.

Before that....
As stated here
ensure that:


Quote
A. You installed Python.
B. You installed selenium via cmd, with
Quote
pip install selenium

C. You installed requests via cmd, with
Quote
pip install requests

D. You have your Chrome bin on the PATH (to test this, open cmd, type "chrome" and check whether Chrome opens).
E. You have the Python folder on your PATH. To test this, open cmd, type "python --version" and check that you get proper feedback, for instance:
Quote
C:\Users\user>python --version
Python 3.12.6
F. pythonw.exe is not missing, or its containing folder is on the PATH. To test this, open cmd, type "pythonw" and check that you get proper feedback, for instance:
Quote
C:\Users\user>pythonw

C:\Users\user> (empty output)
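
The checks above can also be run in one go. The following is only a sketch using the standard library; check_prerequisites is a hypothetical name, and it treats a program as present when it is found on the PATH and a module as present when Python can locate it without importing it.

```python
import importlib.util
import shutil


def check_prerequisites(modules=("selenium", "requests"),
                        programs=("python", "pythonw", "chrome")):
    """Return {name: True/False} for each required module and program.

    A sketch only: programs are looked up on PATH with shutil.which,
    modules are located with importlib without being imported.
    """
    results = {}
    for module in modules:
        results[f"module:{module}"] = importlib.util.find_spec(module) is not None
    for program in programs:
        results[f"program:{program}"] = shutil.which(program) is not None
    return results


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{'OK     ' if found else 'MISSING'}  {name}")
```

Saved to a file and run with python, it prints one OK or MISSING line per item, which makes it easy to see which of the points above still needs attention.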


These scripts:


Quote
1. Use Chrome browser instead of Firefox
2. Use chromedriver.exe instead of geckodriver
3. Start chromedriver.exe silently
4. Silently invoke the browser in headless mode (no pop-up browser windows)
5. Scrape the .htm pages of the given URLs
6. Need no path set manually inside the script - it is set relative to the path of the Selenium script!


You just use your PVD as ever, just be sure to extract as instructed above.

To use the relative path, ensure:

Quote
6B. You put the appropriate chromedriver.exe in the "Scripts" folder, too. There is no installation for chromedriver; just extract it from the .zip file into your "Scripts" folder described above. IMPORTANT!!!! You need to download the chromedriver.exe of the same version as your Chrome browser. At the moment of this post, the stable version is v134. You can find the Chrome browser download and the appropriate chromedriver here. For example, for v134, the Stable links are:
Chrome browser:

1. chrome   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chrome-win32.zip
2. chrome   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chrome-win64.zip
Chromedriver:

1. chromedriver   win32   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win32/chromedriver-win32.zip
2. chromedriver   win64   https://storage.googleapis.com/chrome-for-testing-public/134.0.6998.88/win64/chromedriver-win64.zip
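
The four links above follow one pattern, so the link for another version can be built the same way. A small sketch, assuming the pattern stays as it is in those links; download_url, major_version and versions_match are hypothetical names, and comparing only the major version number is my simplification of "the same version".

```python
BASE_URL = "https://storage.googleapis.com/chrome-for-testing-public"


def major_version(version):
    """Return the major part of a version string as an integer."""
    return int(version.split(".")[0])


def versions_match(chrome_version, chromedriver_version):
    """True when browser and driver share the same major version.

    Assumption: matching the major number is treated as sufficient here.
    """
    return major_version(chrome_version) == major_version(chromedriver_version)


def download_url(version, platform="win64", artifact="chromedriver"):
    """Build a download link following the pattern of the links listed above.

    artifact is "chrome" or "chromedriver"; platform is "win32" or "win64".
    """
    return f"{BASE_URL}/{version}/{platform}/{artifact}-{platform}.zip"
```

Calling download_url("134.0.6998.88") reproduces the win64 chromedriver link from the list, and versions_match can be used to notice when Chrome has updated itself past the chromedriver you extracted.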

From this point on, everything is automated and headless, silent as never before.

The amount of data imported is huge! I have included dozens of new custom fields. IF YOU ARE AS DATA HUNGRY AS I AM, THE MOST DATA CAN BE COLLECTED IF YOU CHECK AND UNCHECK OPTIONS AS IN THE SCRIPTS CONFIGURATOR SCREENSHOTS BELOW. The updated table with all the fields in these 3 scripts can be found here. To see them all, you have to use the Classic skin, or add the custom fields to your PVD and create your own custom skin from there, or use one of my skins from here, once I complete them and adjust them for the final Selenium v4 scripts.


Examine the table. That is the only way you will learn which fields come from which movie/person page and whether you want them or not. The less you want, the faster PVD will be.

Please feel free to test the scripts and give me feedback if something doesn't work. When I say "it doesn't work", I mean that everything works and the only issues that can happen are that the sites have changed their HTML layout so that, again, not all fields are available, or that your Chrome updated itself automatically to a higher version and you didn't download and extract the corresponding chromedriver version. The best indicator of this is that no .log file is created in the "\Scripts\Tmp\" folder. I will fix whatever you report within a month at most from the first report, to give us all time to collect and report as many issues as possible.
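
A quick way to check for that missing .log file is sketched below. It uses only the standard library; recent_logs is a hypothetical name, and the ten-minute window is an arbitrary assumption you can change.

```python
import time
from pathlib import Path


def recent_logs(tmp_dir, max_age_seconds=600):
    """Return .log files in tmp_dir modified within the last max_age_seconds.

    An empty list right after running a script suggests that the Selenium
    side never started.
    """
    folder = Path(tmp_dir)
    if not folder.is_dir():
        return []
    cutoff = time.time() - max_age_seconds
    return sorted(p for p in folder.glob("*.log") if p.stat().st_mtime >= cutoff)
```

Point it at your own "Scripts\Tmp" folder right after a failed import; if the list comes back empty, compare your Chrome and chromedriver versions first.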

What I have learned


On this long journey, I learned how hard coding is. I also had to learn Pascal/Delphi and Python pretty deeply, and, most frustratingly, AHK! I had to revise all the scripts from scratch several times, either because of the concepts I was developing along the way or because IMDb and FilmAffinity were changing their layouts. For example, just yesterday a new Chrome version came out and my chromedriver stopped working, so I had to download a new version of it too. Also, just 2 days ago I learned that FilmAffinity introduced their AKA for some movies, so I had to update the FA script again. And so on and on for 4 months. Thus, I learned to appreciate it. Most importantly, I now appreciate even more the work that EasyVVV, and especially Ivek, have done for more than a decade (!!!) to keep PVD alive for us!

So, humbly, I dedicate this hard work to EasyVVV, but most of all, and before anyone else, I dedicate it to IVEK and to the memory of his late mother! I HOPE IVEK CAN IMAGINE HOW GRATEFUL TO HIM I PERSONALLY AM, AFTER I REALIZED HOW HARD THIS ALL WAS! GOD BLESS YOU IVEK!
