Above is a part of the log output showing that the Function ParsePage_IMDBPeopleAWARDS does not close. I had suspected this before: the part of the code that ends the Function ParsePage_IMDBPeopleAWARDS is missing.
Wow! Strange things happen! Now I realize what you meant, but it never occurred to me because it didn't loop in my case; that's why I didn't understand!
Here is the IMDB_People_[EN][HTTPS]_Awards script, which now correctly transfers Awards data to the awards field for the person 'Chico' Hernandez from the URL below, using a Python Selenium script:
https://www.imdb.com/name/nm0379491/awards/
I have corrected or added some parts of the code to your code and it works.
Python Selenium script instructions and code will probably be published by the new year in the Integrating Selenium to PVD topic:
http://www.videodb.info/forum_en/index.php/topic,4357.0.html
Thanks! It's so great that you are willing to look at the code I provide AND HELP! I'm still testing it, and it looks like it properly parses the awards inside events, but it always uses the name of the first event. In your case the person had only one award, but in my case there are multiple events. The first is "Ariel Awards, Mexico", and the log shows Oscars, ALMA Awards, and others (more follow that I didn't post), yet all of them are added to the "Ariel Awards, Mexico" event:
(12/23/2024 8:34:38 PM) Parsed Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Parsed Award: Golden Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Golden Ariel
(12/23/2024 8:34:38 PM) Category: Best Picture (Mejor Película)
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Golden Ariel, Category=Best Picture (Mejor Película), Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Silver Ariel
(12/23/2024 8:34:38 PM) Parsed Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Parsed Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Parsed Year: 1992
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Silver Ariel
(12/23/2024 8:34:38 PM) Category: Best Original Story (Mejor Argumento Original)
(12/23/2024 8:34:38 PM) Recipient: Love in the Time of Hysteria
(12/23/2024 8:34:38 PM) Year: 1992
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Silver Ariel, Category=Best Original Story (Mejor Argumento Original), Recipient=Love in the Time of Hysteria, Year=1992, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Parsed Recipient: Roma
(12/23/2024 8:34:38 PM) Parsed Year: 2019
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Cinematography
(12/23/2024 8:34:38 PM) Recipient: Roma
(12/23/2024 8:34:38 PM) Year: 2019
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Cinematography, Recipient=Roma, Year=2019, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Oscar
(12/23/2024 8:34:38 PM) Parsed Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2007
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Oscar
(12/23/2024 8:34:38 PM) Category: Best Achievement in Film Editing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2007
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Oscar, Category=Best Achievement in Film Editing, Recipient=Gravity, Year=2007, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: Saturn Award
(12/23/2024 8:34:38 PM) Parsed Category: Best Writing
(12/23/2024 8:34:38 PM) Parsed Recipient: Gravity
(12/23/2024 8:34:38 PM) Parsed Year: 2014
(12/23/2024 8:34:38 PM) Parsed Won: True
(12/23/2024 8:34:38 PM) Before calling AddAward with parameters:
(12/23/2024 8:34:38 PM) Event: Ariel Awards, Mexico
(12/23/2024 8:34:38 PM) Award: Saturn Award
(12/23/2024 8:34:38 PM) Category: Best Writing
(12/23/2024 8:34:38 PM) Recipient: Gravity
(12/23/2024 8:34:38 PM) Year: 2014
(12/23/2024 8:34:38 PM) Won: True
(12/23/2024 8:34:38 PM) AddAward executed successfully.
(12/23/2024 8:34:38 PM) Added Award to Database: Event=Ariel Awards, Mexico, Award= Saturn Award, Category=Best Writing, Recipient=Gravity, Year=2014, Won: True
(12/23/2024 8:34:38 PM) Parsed Award: ALMA Award
(12/23/2024 8:34:38 PM) Parsed Category: Outstanding Screenplay - Motion Picture
(12/23/2024 8:34:38 PM) Parsed Recipient: Children of Men
(12/23/2024 8:34:38 PM) Parsed Year: 1999
(12/23/2024 8:34:38 PM) Parsed Won: True
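A log like this (a single "Parsed Event" line followed by awards that belong to several events) is what one would expect if the event name is read once, before the loop, instead of once per event block. That is only a guess about the cause, since the parsing code itself is not shown here. The idea can be sketched in Python (not PVD Pascal) on made-up markup. The `<section>`/`<h3>`/`<li>` layout, the "Some Other Event" name, and both function names are assumptions for illustration, not the real IMDb page structure or the real script:

```python
import re

# Hypothetical, simplified markup. This is NOT the real IMDb awards page.
SAMPLE = """
<section><h3>Ariel Awards, Mexico</h3>
<li>Golden Ariel|Roma|2019</li>
<li>Silver Ariel|Love in the Time of Hysteria|1992</li>
</section>
<section><h3>Some Other Event</h3>
<li>Oscar|Roma|2019</li>
</section>
"""


def parse_buggy(html):
    """Reads the event name once, so every award inherits the first event."""
    event = re.search(r"<h3>(.*?)</h3>", html).group(1)
    return [(event, m.group(1)) for m in re.finditer(r"<li>(.*?)\|", html)]


def parse_fixed(html):
    """Re-reads the event name inside each event block before its awards."""
    rows = []
    for block in re.finditer(r"<section>(.*?)</section>", html, re.S):
        body = block.group(1)
        event = re.search(r"<h3>(.*?)</h3>", body).group(1)
        for m in re.finditer(r"<li>(.*?)\|", body):
            rows.append((event, m.group(1)))
    return rows


if __name__ == "__main__":
    for row in parse_buggy(SAMPLE):
        print("buggy:", row)
    for row in parse_fixed(SAMPLE):
        print("fixed:", row)
```

In the .psf script the equivalent change would be to move the event lookup inside the outer loop, so the event variable is reassigned every time a new event block starts.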
Fortunately, or unfortunately, I'm testing with Alfonso Cuaron,
https://www.imdb.com/name/nm0190859/ who has 152 events and several hundred awards, so it should cover all the cases to be tested.
ONE MORE IMPORTANT THING TO NOTE: For some reason, PVD and the script won't work (at least for me) if I manually set the page to be parsed by Function ParsePage_IMDBPeopleAWARDS, like this for example:
// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
LogMessage('Starting to parse awards page.');
HTML := ('Tmp\UTF8_NO_BOM-Awards.mhtml');
LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
(I can't remember if this is the proper syntax, but whatever it was, I set it properly at the time of testing.)
It wouldn't work without downloading, so I had to fake the download with a completely new function:
Function DownloadPage1(URL:AnsiString; FileName:AnsiString):String;
Var
  ScriptPath, WebText: String;
Begin
  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 BEGIN======================|');
  LogMessage(Chr(9)+Chr(9)+'Global Var-DownloadURL|'+URL+'|');
  LogMessage(Chr(9)+Chr(9)+' Local Var-URL|'+URL+'|');
  ScriptPath := GetAppPath + 'Scripts\';
  // Directly read the existing file instead of downloading
  If FileExists(ScriptPath + FileName) Then Begin
    LogMessage(Chr(9)+Chr(9)+' File already exists: '+ScriptPath + FileName);
    WebText := FileToString(ScriptPath + FileName);
    WebText := ConvertEncoding(WebText, 65001); // Convert to UTF-8
    Result := WebText;
    LogMessage(Chr(9)+Chr(9)+' Read file content successfully.');
  End Else Begin
    LogMessage(Chr(9)+Chr(9)+' File does not exist: '+ScriptPath + FileName);
    Result := '';
  End;
  LogMessage(Chr(9)+Chr(9)+'Function DownloadPage1 END======================|');
End;
and then, to "call the download":
// Parse Awards provider page = BASE_URL_AWARD_PERSON
If GET_FULL_AWARDS Then Begin
LogMessage('Starting to parse awards page.');
HTML := DownloadPage1(DownloadURL, 'Tmp\UTF8_NO_BOM-Awards.mhtml');
LogMessage('Read awards page from file: ' + Copy(HTML, 1, 500)); // Log the file content
When I reach the phase of passing the URL TO SELENIUM TO DOWNLOAD THE PAGE, I'm still not sure how it will work in the .psf: will I have to fake the download after Selenium passes back the page, or something else? For someone who doesn't know how to code, this is too much to comprehend without actual trials.
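One possible shape for the Selenium side of that hand-off, sketched in Python under assumptions: the Python script writes the page it fetched to the same Tmp file that DownloadPage1 reads, as UTF-8 without a BOM. The two function names below are made up for illustration. The Selenium call is left out so the sketch runs on its own; in a real script the `html` argument would be the page text Selenium returns (for example `driver.page_source`).

```python
import codecs
from pathlib import Path


def save_page_utf8_no_bom(html, path):
    """Write page text as UTF-8 without a byte-order mark and return the path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # The "utf-8" codec (unlike "utf-8-sig") writes no BOM. A BOM character
    # already present at the start of the text is stripped first.
    path.write_text(html.lstrip("\ufeff"), encoding="utf-8")
    return path


def starts_with_bom(path):
    """Return True if the file on disk begins with the UTF-8 BOM bytes."""
    return Path(path).read_bytes().startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    import tempfile

    # A temporary folder stands in for the PVD Scripts folder here.
    scripts_dir = Path(tempfile.mkdtemp())
    target = scripts_dir / "Tmp" / "UTF8_NO_BOM-Awards.mhtml"
    save_page_utf8_no_bom("<html>Best Picture (Mejor Película)</html>", target)
    print(target, "BOM present:", starts_with_bom(target))
```

With a file written this way, the .psf side would keep calling DownloadPage1 as it does now; whether PVD needs anything beyond that can only be confirmed by trying it.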