Author Topic: IMDB Script not working for downloading movie Description correctly.


Offline leogets

  • User
  • ***
  • Posts: 75
    • View Profile
Out of 13 movies I recently tried to add to my database, over half ended up with junk text in the movie Description.
For example, "Did you knowTrivia....", which is not the actual description of the movie. I had to follow the links to the IMDb site, copy the actual movie description, and paste it in manually over the junk the script had put there.
I would like to know whether you intend to post a new script to fix this.
Some movies I added where this happened were:
Attack On Finland    http://www.imdb.com/title/tt11636880/
Spiderhead    http://www.imdb.com/title/tt9783600/


Thanks for any help.
« Last Edit: June 19, 2022, 02:29:05 am by leogets »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
The description is no longer downloaded because, in the page's source code, the text between the Storyline and Did you know sections is blocked. I will fix as much as I can, but it will take a while because I have little time for repairing the script at the moment.
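
To illustrate the kind of marker-based extraction involved, here is a minimal standalone Free Pascal sketch. It is not the IMDb script and not PVD's own TextBetWeen helper: TextBetweenMarks, the two sample strings, and the exact marker texts are assumptions made up for illustration. It only shows that when nothing usable sits between the two markers, this style of extraction has no description to return.

```pascal
program TextBetweenSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (not PVD's TextBetWeen): returns the text between the
// first OpenMark and the next CloseMark, or '' if either marker is missing.
function TextBetweenMarks(const S, OpenMark, CloseMark: string): string;
var
  startPos, endPos: Integer;
  rest: string;
begin
  Result := '';
  startPos := Pos(OpenMark, S);
  if startPos = 0 then Exit;
  // Skip past the opening marker and search only in what follows it.
  startPos := startPos + Length(OpenMark);
  rest := Copy(S, startPos, Length(S) - startPos + 1);
  endPos := Pos(CloseMark, rest);
  if endPos = 0 then Exit;
  Result := Trim(Copy(rest, 1, endPos - 1));
end;

var
  withText, withoutText: string;
begin
  // Made-up page text, only to contrast the two situations.
  withText    := 'Storyline A boy is abducted. Did you know Trivia';
  withoutText := 'Storyline Did you know Trivia';
  WriteLn('[', TextBetweenMarks(withText, 'Storyline', 'Did you know'), ']');
  WriteLn('[', TextBetweenMarks(withoutText, 'Storyline', 'Did you know'), ']');
end.
```

Trim comes from the System unit when {$H+} is active in recent Free Pascal versions; if your compiler complains, add "uses SysUtils;" after the mode line.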
Ivek23
Win 10 64bit (32bit)   PVD v0.9.9.21, PVD v1.0.2.7, PVD v1.0.2.7 + MOD


Offline leogets

Thanks Ivek23

One more thing I noticed: of the last 35 movies I added to my database, 27 are missing the Genre.
Is there any way you could make a simple script that would let me download the Genre for selected movies that need updating?


Thanks for all your help.

Offline Ivek23

No problem. In addition to downloading the genre, it will also download or update some standard fields and some custom fields. A full updated version of the IMDB_ [EN] [HTTPS] script will follow in a few days or sooner.


Offline Ivek23

IMDB_ [EN] [HTTPS] _TEST_1 has been updated here so you can test whether the script correctly downloads Genre, Description, Taglines, Production Co, User Reviews and MiniSoundtracks. I am running additional tests with another script to get the Bing search results function working as well; that will be added to the next official version of the IMDB_ [EN] [HTTPS] script.

The script has been added.


Offline leogets

Thank you Ivek23 for all your hard work in keeping PVD up and running.

In the final script I had to change line 86 to SCRIPT_NAME  = 'IMDB [EN][HTTPS]_TEST_1' so that the script was easy to spot and select. I'm trying it out now and will let you know if I run into any more problems. Thanks again; I will be looking to donate again soon.

edit:
I just finished updating my database with 7 new movies using 'IMDB_ [EN] [HTTPS] _TEST_1' after applying the change above, and everything works perfectly for the fields I want to add for each movie.

Thank You.


I was also wondering whether there is any way to set the size of the cover image for each movie in the database. I like my covers at 318 (width) x 424 (height) pixels. At the moment I save the new images to a folder, batch-resize them, and then import each one back in.
« Last Edit: July 02, 2022, 02:27:47 pm by leogets »

Offline leogets

Hello Ivek23,

I just finished updating the movie database once more using 'IMDB_ [EN] [HTTPS] _TEST_1'. For 2 movies, the script downloaded the Description but also a lot of unrelated junk text after it, which made the Description very long and meant I had to clean it up manually. Perhaps you could test these movies yourself to see whether the script needs a fix.
They were both animation type movies:
The Bobs Burgers Movie (2022)    and
The Sea Beast (2022)

Edit:
I added another movie, 'The Phantom Of The Open (2021)', and it downloaded the same junk text after the Description as the two above.

Edit:
Jurassic World Dominion (2022) -- Description added to database after download is nothing like the Description presented on IMDB's site for this movie.

The Black Phone (2021) -- Description added to database after download is similar to the Description presented on IMDB's site for this movie but it is not the same.
Example:
Downloaded Description added to database:
Finney Blake is a shy but clever 13-year-old boy who is abducted by a sadistic killer and trapped in a soundproof basement where screaming is of no use. When a disconnected phone on the wall begins to ring, Finney discovers that he can hear the voices of the killer's previous victims. And they are dead-set on making sure that what happened to them doesn't happen to Finney.

Description given on IMDB's site:
After being abducted by a child killer and locked in a soundproof basement, a 13-year-old boy starts receiving calls on a disconnected phone from the killer's previous victims.
« Last Edit: July 15, 2022, 11:04:29 am by leogets »

Offline Ivek23

Explanation:

A change to the source code of the main IMDb movie pages caused problems with parts of the IMDb script. The text you see in the Storyline section of the main movie page can no longer be retrieved: it is not possible to add code to the IMDb script that would transfer it. The description you see in the black frame under the movie title does not match the description in the Storyline section. The script therefore uses the Storyline (Plot Summary) code for the Reference View pages instead. If there is no Plot Summary on the Reference View page, the description from the top of the page, under the movie title, is transferred instead.
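
The fallback order described above can be sketched as follows. This is a minimal standalone Free Pascal illustration, not the script itself: ChooseDescription and its parameter names are hypothetical, and the sample strings are shortened versions of the two Black Phone descriptions quoted in this thread.

```pascal
program DescriptionFallbackSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: prefer the Plot Summary from the Reference View page;
// if it is empty, fall back to the short description under the movie title.
function ChooseDescription(const PlotSummary, TopDescription: string): string;
begin
  if PlotSummary <> '' then
    Result := PlotSummary
  else
    Result := TopDescription;
end;

begin
  // Plot Summary present: it wins over the short description.
  WriteLn(ChooseDescription('Finney Blake is a shy but clever 13-year-old boy',
                            'After being abducted by a child killer'));
  // Plot Summary missing: the short description is used instead.
  WriteLn(ChooseDescription('', 'After being abducted by a child killer'));
end.
```

This is why the downloaded text can differ from the short description at the top of the IMDb page: the two come from different parts of the site.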

The Black Phone (2021) -- Description added to database after download is similar to the Description presented on IMDB's site for this movie but it is not the same.
Example:
Downloaded Description added to database:
Finney Blake is a shy but clever 13-year-old boy who is abducted by a sadistic killer and trapped in a soundproof basement where screaming is of no use. When a disconnected phone on the wall begins to ring, Finney discovers that he can hear the voices of the killer's previous victims. And they are dead-set on making sure that what happened to them doesn't happen to Finney.

In this case, the script downloaded the description that is visible in the Storyline section, using the Storyline code for the Reference View pages.

The Black Phone (2021) -- Description added to database after download is similar to the Description presented on IMDB's site for this movie but it is not the same.
Example:

Description given on IMDB's site:
After being abducted by a child killer and locked in a soundproof basement, a 13-year-old boy starts receiving calls on a disconnected phone from the killer's previous victims.

This is the text shown in the black frame under the movie title. It does not match the description in the Storyline section, and possibly not the one in the Reference View of the movie page either.



For the rest of the above, I'll check to see what's wrong.


Offline Ivek23

Edit:
Jurassic World Dominion (2022) -- Description added to database after download is nothing like the Description presented on IMDB's site for this movie.

The same explanation as for The Black Phone (2021) applies.

I just finished updating the movie database once more using 'IMDB_ [EN] [HTTPS] _TEST_1'. For 2 movies, the script downloaded the Description but also a lot of unrelated junk text after it, which made the Description very long and meant I had to clean it up manually. Perhaps you could test these movies yourself to see whether the script needs a fix.
They were both animation type movies:
The Bobs Burgers Movie (2022)    and
The Sea Beast (2022)

Edit:
I added another movie, 'The Phantom Of The Open (2021)', and it downloaded the same junk text after the Description as the two above.

I added the following piece of code, which cuts the description off at the first occurrence of 'Plot ', and it now works correctly.
Code:
        curPos:=Pos('Plot ',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);
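
For anyone who wants to try the truncation idea outside PVD, here is a minimal standalone Free Pascal sketch. TruncateAt, the Markers array, and the sample text are made up for illustration; only the marker strings 'Written by ' and 'Plot ' are taken from the script.

```pascal
program TruncateSketch;
{$mode objfpc}{$H+}

// Hypothetical helper: cuts S just before the first occurrence of Marker,
// or returns S unchanged when the marker does not occur.
function TruncateAt(const S, Marker: string): string;
var
  p: Integer;
begin
  p := Pos(Marker, S);
  if p > 0 then
    Result := Copy(S, 1, p - 1)
  else
    Result := S;
end;

const
  // Marker strings used by the script to detect trailing text.
  Markers: array[0..1] of string = ('Written by ', 'Plot ');
var
  desc: string;
  i: Integer;
begin
  // Made-up sample: a description followed by unwanted trailing text.
  desc := 'A monster hunter meets a stowaway. Plot Keywords and other trailing text';
  for i := Low(Markers) to High(Markers) do
    desc := TruncateAt(desc, Markers[i]);
  WriteLn('[', desc, ']');
end.
```

One caveat of this approach: a genuine description that happens to contain 'Plot ' (with a capital P and a trailing space) would also be cut at that point.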



Quote
Function ParsePage_IMDBMovieREFERENCE(HTML:String):Cardinal; //BlockOpen
    //Returns:
    //     Result:=prFinished; Script has finished gathering data
    //     Result:=prError; If there is any big problem (exits early)
    //Retrieve: REFERENCE~
  Var
    curPos,endPos:Integer;   
    ItemValue,ItemList:String;
    //Category,Name:String;
   debug_pos1:Integer;
   ItemValue22:String;
   ItemValue0,ItemValue1,ItemValue2:String;      
   ItemValue10,ItemValue11,ItemValue12,ItemList2:String;      
  Begin
    LogMessage('Function ParsePage_IMDBMovieREFERENCE BEGIN=====================||');
    Result:=prFinished;  //It will change to prError if any big problem with exit;
   
   //(*
   //~Imdb Title 1~
   If Pos('<div id="content-2-wide" class="redesign">',HTML)>0 Then Begin
      curPos:=PosFrom('<h3 itemprop="name">',HTML,EndPos);
      //curPos:=curPos+Length('<h3 itemprop="name">');
      EndPos:=PosFrom('<section class="titlereference-section-overview">',HTML,curPos);
      ItemList:=Copy(HTML,curPos,endPos-curPos);
      //ItemList:=Trim(Copy(HTML,curPos,endPos-curPos));   
      //LogMessage(#13+'           Parse results ('+IntToStr(curPos)+','+IntToStr(endPos)+') complex ItemList:'+ItemList+'||'+#13);
      if (Length(ItemList)>0) then begin   
         ItemValue0:=TextBetWeenFirst(ItemList,'<h3 itemprop="name">','<span class="titlereference-title-year">');
         if ItemValue0 <> '' then LogMessage('      Get result ItemValue0:'+ItemValue0+'||');   
         if ItemValue0 <> '' then ItemList2:=ItemList2+ItemValue0+' ';
         ItemValue10:=TextBetWeenFirst(ItemList,'<span class="titlereference-title-year">','</span>');
         if ItemValue10 <> '' then LogMessage('      Get result ItemValue10:'+ItemValue10+'||');
         if ItemValue10 <> '' then ItemList2:=ItemList2+ItemValue10+#13;
         ItemValue1:=TextBetWeenFirst(ItemList,'</h3>','</span>');
         if ItemValue1 <> '' then LogMessage('      Get result ItemValue1:'+ItemValue1+'||');
         debug_pos1:=Pos('if (',ItemValue1);
         if debug_pos1 >0 then ItemValue1 := Copy(ItemValue1,0,debug_pos1-1);
         if ItemValue1 <> '' then LogMessage('      Get result ItemValue1a:'+ItemValue1+'||');         
         if ItemValue1 <> '' then ItemList2:=ItemList2+ItemValue1+#13;
         ItemValue11:=HTMLValue(HTML,curPos,0,'<ul class="ipl-inline-list">','</ul>');
         if ItemValue11 <> '' then LogMessage('      Get result    ItemValue11:'+ItemValue11+'||');
         ItemValue11:=StringReplace(ItemValue11,'                                        ','  '+#8729+'  ',True,False,True);
         ItemValue11:=StringReplace(ItemValue11,'                        ','  '+#8729+'  ',True,False,True);
         ItemValue11:=StringReplace(ItemValue11,'                                ','  '+#8729+'  ',True,False,True);   
         ItemValue11:=StringReplace(ItemValue11,'        ','',True,False,True);
         if ItemValue11 <> '' then ItemList2:=ItemList2+ItemValue11+#13;
         ItemValue2:=TextBetWeenFirst(ItemList,'<span class="ipl-rating-star__rating">','</span>');
         if ItemValue2 <> '' then LogMessage('      Get result ItemValue2:'+ItemValue2+'||');
         if ItemValue2 <> '' then ItemList2:=ItemList2+ItemValue2;
         ItemValue12:=TextBetWeenFirst(ItemList,'<span class="ipl-rating-star__total-votes">','</span>');
         if ItemValue12 <> '' then LogMessage('      Get result ItemValue12:'+ItemValue12+'||');
         if ItemValue12 <> '' then ItemList2:=ItemList2+'  '+#8226+'  '+ItemValue12;
         //Write to ~features~ field
         //if (Length(ItemList2)>0) then begin
         //   AddCustomFieldValueByName('Imdb Title 1',ItemList2); //Ivek23 CustomField ~ImdbTechSpecs~ for ~features~
            LogMessage('      Get result Movie ~Features~ (CF~ImdbTechSpecs~):'+ItemList2+'||');
         //End;      
         //ItemList2:=ItemValue0+' '+ItemValue10+#13+ItemValue1+#13+ItemValue11+#13+ItemValue2+'  '+#8226+'  '+ItemValue12;      
         If ItemList2 <> '' then AddCustomFieldValueByName('Imdb Title 1',ItemList2);   
      End;      
   End;
   //*)      
   
    //Get "Storyline" as ~description~  **
    curPos:=Pos('<section class="titlereference-section-overview">',HTML);                                 //WEB_SPECIFIC.
    If 0<curPos then begin
        ItemValue22:=TextBetWeen(HTML,'<div>','</div>',false,curPos);        //Strings which opens/closes the data. WEB_SPECIFIC   
        //ItemValue:=StringReplace(ItemValue, 'Edit', '', true, false, true);  //Cleaning. WEB_SPECIFIC.      
        //ItemValue:=StringReplace(ItemValue, 'Industry information at your fingertips', '', true, false, true);  //Cleaning. WEB_SPECIFIC.
        //ItemValue:=StringReplace(ItemValue, 'Some parts of this page won'+#39+'t work property. Please reload or try later.', '', true, false, true); //Cleaning. WEB_SPECIFIC.
        curPos:=Pos('Season',ItemValue22);   //WEB_SPECIFIC.
        If 0<curPos then ItemValue22:=Copy(ItemValue22,0,curPos-1);
      curPos:=Pos('Seasons',ItemValue22);   //WEB_SPECIFIC.      
        If 0<curPos then ItemValue22:=Copy(ItemValue22,0,curPos-1);
        //AddFieldValueXML('description',ItemValue22);
        if ItemValue22 <> '' then LogMessage('      Get result description2:'+#13+ItemValue22+'||');
    End;      

    //Get "Plot Summary" as ~description~
    curPos:=Pos('<td class="ipl-zebra-list__label">Plot Summary</td>',HTML);                                 //WEB_SPECIFIC.
    If 0<curPos then begin
        ItemValue:=TextBetWeen(HTML,'<p>','<p>',false,curPos);        //Strings which opens/closes the data. WEB_SPECIFIC
        ItemValue:=StringReplace(ItemValue, 'Edit', '', true, false, true);  //Cleaning. WEB_SPECIFIC.      
        ItemValue:=StringReplace(ItemValue, 'Industry information at your fingertips', '', true, false, true);  //Cleaning. WEB_SPECIFIC.
        ItemValue:=StringReplace(ItemValue, 'Some parts of this page won'+#39+'t work property. Please reload or try later.', '', true, false, true); //Cleaning. WEB_SPECIFIC.
        curPos:=Pos('—',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);
        curPos:=Pos('Written by ',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);   
        curPos:=Pos('Plot ',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);      
        //AddFieldValueXML('description',ItemValue);
        LogMessage('      Get result description:'+#13+ItemValue+'||');
    End;   
   
   //Prefer the Plot Summary; fall back to the overview text only when the Plot Summary is empty.
   if ItemValue <> '' then
      AddFieldValueXML('description',ItemValue)
   else if ItemValue22 <> '' then
      AddFieldValueXML('description',ItemValue22);
   
    //Get ~tags~ "keywords" (field with several values in a comma separated list)
   If Not(GET_FULL_PLOTKEYWORDS) Then Begin
      curPos:=Pos('<td class="ipl-zebra-list__label">Plot Keywords</td>',HTML);                                      //WEB_SPECIFIC.
      If 0<curPos Then Begin
         EndPos:=curPos;
         ItemValue:=HTMLValues(HTML,'<td class="ipl-zebra-list__label">Plot Keywords</td>','</ul>','<li class="ipl-inline-list__item">','</li>',', ',endPos);
         curPos:=Pos('See All',ItemValue);                               //WEB_SPECIFIC.
         If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);            
            AddFieldValueXML('tags',ItemValue);
            if ItemValue <> '' then LogMessage('      Get results Tags:'+ItemValue+'||');
       End;
    End;   

    //Get ~tagline~
    curPos:=Pos('<td class="ipl-zebra-list__label">Taglines</td>',HTML);                                 //WEB_SPECIFIC.
    If 0<curPos then begin
        ItemValue:=TextBetWeen(HTML,'<td>','<td>',false,curPos);        //Strings which opens/closes the data. WEB_SPECIFIC
        ItemValue:=StringReplace(ItemValue, 'Edit', '', true, false, true);  //Cleaning. WEB_SPECIFIC.   
       curPos:=Pos('See more',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);   
        AddFieldValueXML('tagline',ItemValue);
        if ItemValue <> '' then LogMessage('      Get result tagline:'+ItemValue+'||');
    End;   
      
    LogMessage('Function ParsePage_IMDBMovieREFERENCE END=====================||');
  End; //BlockClose

Copy the entire code section above and paste it into the IMDb script, replacing the entire old Function ParsePage_IMDBMovieREFERENCE code section.



Alternatively, you can replace only this part of the code in Function ParsePage_IMDBMovieREFERENCE:

Quote
    //Get "Plot Summary" as ~description~
    curPos:=Pos('<td class="ipl-zebra-list__label">Plot Summary</td>',HTML);                                 //WEB_SPECIFIC.
    If 0<curPos then begin
        ItemValue:=TextBetWeen(HTML,'<p>','<p>',false,curPos);        //Strings which opens/closes the data. WEB_SPECIFIC
        ItemValue:=StringReplace(ItemValue, 'Edit', '', true, false, true);  //Cleaning. WEB_SPECIFIC.      
        ItemValue:=StringReplace(ItemValue, 'Industry information at your fingertips', '', true, false, true);  //Cleaning. WEB_SPECIFIC.
        ItemValue:=StringReplace(ItemValue, 'Some parts of this page won'+#39+'t work property. Please reload or try later.', '', true, false, true); //Cleaning. WEB_SPECIFIC.
        curPos:=Pos('—',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);
        curPos:=Pos('Written by ',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);   
        curPos:=Pos('Plot ',ItemValue);                               //WEB_SPECIFIC.
        If 0<curPos then ItemValue:=Copy(ItemValue,0,curPos-1);      
        //AddFieldValueXML('description',ItemValue);
        LogMessage('      Get result description:'+#13+ItemValue+'||');
    End;   
« Last Edit: July 20, 2022, 04:25:45 pm by Ivek23 »


Offline Pacifist

  • User
  • ***
  • Posts: 68
    • View Profile
thanks for keeping the project alive.  ;)

Offline Ivek23

thanks for keeping the project alive.  ;)

Welcome.


 
