Author Topic: curl - solution for https  (Read 46484 times)


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
curl - solution for https
« on: October 30, 2016, 11:06:45 am »
PS: Ivek, don't worry about the https issue. There is a workaround: we can use an external program (I think curl, https://curl.haxx.se/) to download the page to a file and scrape it with PVdB.

This may be a good solution, but I would need help: I'm not a programmer, and that is part of the problem (I don't know how to use it).

So could someone who knows these things better please help and explain how it is used?
Ivek23
Win 10 64bit (32bit)   PVD v0.9.9.21, PVD v1.0.2.7, PVD v1.0.2.7 + MOD


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #1 on: October 30, 2016, 09:34:53 pm »
You don't need to program anything outside PVdB. curl.exe is a command-line program that downloads a web page (even https) to a file, so:

In the 'GetDownloadURL' PVdB function:
1) Download the https page (for instance: https://www.themoviedb.org/movie/238-the-godfather) by running the following command with the PVdB FileExecute procedure:
curl -s -o downpage.htm https://www.themoviedb.org/movie/238-the-godfather
(Now you have the page in the file 'downpage.htm'. Easy to scrape, no?)

2) Trick the PVdB 'GET' function, as a workaround for the "https" failure, by returning a fake URL: BASE_URL_RONDABOUT = 'RONDABOUT'.
(You can take inspiration from my TheMovieDB_[ES].psf script, or perhaps use a file as a dummy search string as in my Several_File_Infos.psf)

3) When PVdB returns 'empty' to the mandatory callback function ParsePage, you can parse the page file with HTML:=FileToString('downpage.htm');

Well, I hope this serves as inspiration (I'm explaining it quickly because I don't have much time).
External files needed:
1) Download the curl-7.50.2-win32-mingw.7z file from https://bintray.com/artifact/download/vszakats/generic/curl-7.50.2-win32-mingw.7z   (Thanks Viktor Szakáts).
2) Extract the three curl files and copy them to the Scripts folder:
• curl-7.50.2-win32-mingw\bin\curl.exe
• curl-7.50.2-win32-mingw\bin\curl-ca-bundle.crt
• curl-7.50.2-win32-mingw\bin\libcurl.dll
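
The download step can also be tried outside PVdB. Below is a minimal Python sketch, not part of any PVdB script: it builds the same `curl -s -o` call as an argument list and reads the saved file back. The helper names `build_curl_command` and `download_page` are hypothetical, and the sketch assumes curl.exe is in the current folder or on the PATH.

```python
import subprocess
from pathlib import Path


def build_curl_command(url, out_file="downpage.htm"):
    """Return the curl call from step 1 as an argument list.

    -s silences the progress meter, -o writes the response body to out_file.
    """
    return ["curl", "-s", "-o", str(out_file), url]


def download_page(url, out_file="downpage.htm", runner=subprocess.run):
    """Run curl, then return the saved page as text ("" if no file was written).

    `runner` can be swapped out so the function can be tried without a network.
    """
    runner(build_curl_command(url, out_file), check=False)
    path = Path(out_file)
    if not path.exists():
        return ""
    return path.read_text(encoding="utf-8", errors="replace")
```

Passing the arguments as a list means a file name or folder containing spaces reaches curl as a single argument.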


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #2 on: October 31, 2016, 04:13:01 pm »
Thank you for this clarification.

Quote
2) Trick the PVdB 'GET' function, as a workaround for the "https" failure, by returning a fake URL: BASE_URL_RONDABOUT = 'RONDABOUT'.
(You can take inspiration from my TheMovieDB_[ES].psf script, or perhaps use a file as a dummy search string as in my Several_File_Infos.psf)

3) When PVdB returns 'empty' to the mandatory callback function ParsePage, you can parse the page file with HTML:=FileToString('downpage.htm');

Well, I hope this serves as inspiration (I'm explaining it quickly because I don't have much time).

I have never fully understood the way your scripts are written, and the code in them has rarely been clear to me. I was never taught even the basics of computing, so Pascal and similar things give me a lot of difficulty, both in understanding scripts and in writing them (I have never written plugins and will not in the future). Pre-written scripts help me a lot, because it is then easier to write something myself. So in this case too, a simple example script would come in handy for the future.

Thanks for the help.


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #3 on: November 01, 2016, 09:55:56 pm »
How could I refuse to help you!

I changed your Rottentomatoes script a little to solve the https issue with curl:
First, make a backup of your PVdB folder!
Decompress the attachment into your PVdB folder (everything goes to the Scripts folder; I had to split it into two files across two messages because the forum limits the maximum attachment size) and change PVdB_SCRIPTS_PATH_FOLDER in the script.
Launch PVdB in debug mode (debug.bat) and you can see what the script does in the Help/Log window.

Some explanation of how PVdB Import works:
1) It uses the 'GetBaseURL' result to try to download the movie page directly. So you must trick it with a fake URL to keep it from downloading the https page. Does that mean you always have to search for the movie? Don't worry: you look up the stored movie URL yourself in 'GetDownloadURL'.
2) PVdB doesn't have a URL, so it calls 'GetDownloadURL' to get one.
WORKAROUND: In this function we download the page (movie, search, etc.) with curl and hand PVdB a file in place of a URL.
2.1) If we are in search mode and have a stored movie URL, we download the page, change to normal mode and return the file with the page.
2.2) If we don't have a URL, we download the search page, stay in search mode and return the file with the page.
2.3) For other modes you can do something similar, changing the script mode to pass the information to the PVdB 'ParsePage' function.
3) PVdB goes to 'ParsePage' to work on the HTML variable. Nothing is different here: HTML holds the page content (same behaviour for file or web), and the script mode tells you the type of page (search, movie, rating, poster, etc.).
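
The decision made inside 'GetDownloadURL' (steps 2.1 and 2.2) can be sketched in a few lines. This is Python used purely as an illustration, not PVdB script; the function name `choose_download` and the mode strings "normal" and "search" are assumptions made for the example.

```python
def choose_download(stored_url, search_url):
    """Pick the page to fetch with curl and the mode ParsePage will see.

    Step 2.1: a stored movie URL exists -> fetch it and switch to normal mode.
    Step 2.2: no stored URL -> fetch the search page and stay in search mode.
    """
    if stored_url:
        return stored_url, "normal"
    return search_url, "search"
```

Either way the caller downloads the chosen page to a file and returns that file, so 'ParsePage' only has to look at the mode to know what kind of page it received.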

Some advice:
1) For editing I use PSPad with 'Object Pascal' syntax highlighting (it makes some errors easier to spot).
2) I save the html page and put the script and the page side by side in the editor (vertically tiled windows), which makes it easy to progress with scraping the information.
3) Use, use, use the 'LogMessage' command with the PVdB Help/Log window in debug.bat mode. First you can see the compile errors; while running, you can see the script flow and the variable contents.
4) A PVdB script is little more than word-processor search, find and copy. Don't be afraid of programming. BTW, I now use the 'TextBetWeen' function a lot for retrieving info. You can see it in my FilmAffinity_[ES].psf:
      //Get ~orating~
      ItemValue:=TextBetWeen(HTML,'<div id="rat-avg-container">','</div>',false,curPos);     //Strings that open/close the data. WEB_SPECIFIC
      ItemValue:=StringReplace(ItemValue,',','.',True,True,False); //Spanish decimal comma to English decimal point.
      AddFieldValueXML('orname',RATING_NAME);
      AddFieldValueXML('orating',ItemValue);
      LogMessage('      Get result orating:'+ItemValue+'||');

BTW, here you can see the effect of syntax highlighting and the use of 'LogMessage'.
5) Now I only use ParsePage (it is mandatory) for scraping, so I have a single flow. I distinguish the script mode with:
      if (Mode=smSearch) then begin       //In search Mode
          ......
          Mode:=smNormal;
          Result:=prList;     //Doesn't work with Preferences/Plugins/Silent Mode.
          LogMessage('After parsing search Movies go to choose List Results');   
          Exit;
      end;
      if (Mode=smNormal) then begin        //In normal, movie info, Mode
          ......
          Result:=prFinished;
          exit;
     end;
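
To try the 'TextBetWeen' idea from advice 4 outside PVdB, here is a small Python sketch. It is not the PVdB function itself: `text_between` is a hypothetical stand-in that returns the text between two markers plus the position where the search ended, and the sample HTML is invented for the example.

```python
def text_between(html, start, end, from_pos=0):
    """Return (text, next_pos) for the first start...end pair at or after from_pos.

    Returns ("", from_pos) when either marker is missing.
    """
    begin = html.find(start, from_pos)
    if begin < 0:
        return "", from_pos
    begin += len(start)
    stop = html.find(end, begin)
    if stop < 0:
        return "", from_pos
    return html[begin:stop], stop + len(end)


sample = '<div id="rat-avg-container"> 7,9 </div>'
rating, pos = text_between(sample, '<div id="rat-avg-container">', '</div>')
rating = rating.strip().replace(",", ".")  # decimal comma -> decimal point
print(rating)
```

Returning the end position lets the next call continue from where the previous one stopped, which is how successive fields can be read off a page in order.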

I hope this helps you. I tried the 'curl Rottentomatoes script' and it downloads the info fine, but I don't have time to write the scraping part (the part of the script that finds the information). Tell me about your progress and problems.
(Rename the attachment Script.001.zip -> Scripts.zip.001)

Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #4 on: November 01, 2016, 09:57:26 pm »

Rename the attachment Script.002.zip -> Scripts.zip.002. Decompress the first file with 7-Zip; it picks up the second automatically.


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #5 on: November 02, 2016, 12:21:24 pm »
Thank you for your efforts.

This still does not work for me as it should. I changed the PVD program path in the script, but it has no effect (no data is transferred, and no web page is fetched for the movie search). The 'downpage.htm' file is not created automatically and, consequently, is not found. If I create the file by hand, it stays empty: there is no data in it.


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #6 on: November 02, 2016, 07:30:51 pm »
Well, nobody said it would be easy. We can begin the "debug":
1) Do you have this (Script_Folder.jpg) in your PVdB Scripts folder?

2) Did you change
      PVdB_SCRIPTS_PATH_FOLDER   = 'C:\Users\Public\Portables\PersonalVideoDB\Scripts\';  //The PVdB scripts path folder
to your own Scripts folder (don't forget the trailing '\')?
3) Try with an example movie whose BASE_URL is not httpS (Capture2.jpg)

and import with Rottentomatoes_[HTTPS].psf in debug mode. Can you see something like this (LogWindow.jpg) in the Log window?

If not, can you attach your Log window?
« Last Edit: November 02, 2016, 07:39:22 pm by VVV_Easy_Programing »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #7 on: November 03, 2016, 06:55:56 am »
Quote
1) Do you have this (Script_Folder.jpg) in your PVdB Scripts folder?
Yes

Quote
2) Did you change
      PVdB_SCRIPTS_PATH_FOLDER   = 'C:\Users\Public\Portables\PersonalVideoDB\Scripts\';  //The PVdB scripts path folder
to your own Scripts folder (don't forget the trailing '\')?
Yes

Quote
3) Try with an example movie whose BASE_URL is not httpS (Capture2.jpg)
Yes

Quote
If not, can you attach your Log window?

Log window
Code: [Select]
(3.11.2016 6:54:23) Compiling script: Rottentomatoes_[HTTPS].psf
(3.11.2016 6:54:23) Script compiled successfully: Rottentomatoes_[HTTPS].psf
(3.11.2016 6:54:23) Executing script binary
(3.11.2016 6:54:23) Logging in...
(3.11.2016 6:54:23) Function GetDownloadURL|
(3.11.2016 6:54:23) Global Var-Mode|0|
(3.11.2016 6:54:23) Dowloand with curl in C:\Program Files\Personal Video Database\Scripts\downpage.htm the information of:|https://www.rottentomatoes.com/m/godfather||
(3.11.2016 6:54:25) Parse stored information of:|https://www.rottentomatoes.com/m/godfather||
(3.11.2016 6:54:25) Searching movie information for: Godfather
(3.11.2016 6:54:39) Function ParsePage|
(3.11.2016 6:54:39) Global Var-Mode|1|
(3.11.2016 6:54:39) Local Var-URL||
(3.11.2016 6:54:39) Local Var-HTML.Begin||End


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #8 on: November 03, 2016, 08:46:05 pm »
Bad news.
 I don't see any special permission for curl in my Windows firewall.
         My firewall is the Windows one. Do you have a firewall other than the standard Windows one?
         I have Firefox installed. Do you have it?
         My antivirus is Avast. Do you have a different antivirus?
         What is the content of your downloaded file? Something like this, or nothing at all?
                 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
                 <html><head>
                <title>403 Forbidden</title>

Can you try these commands in a DOS window in the Scripts folder?

1) Download an httpS page to a file, not silent
curl.exe -o downpage_try.htm https://www.rottentomatoes.com/m/godfather

I get the page and curl shows
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  222k  100  222k    0     0  87034      0  0:00:02  0:00:02 --:--:-- 87034


2) Download an http page to a file, not silent
curl.exe -o downpage_try.htm http://www.imdb.com/title/tt0068646/?ref_=fn_al_tt_7

I get the page and curl shows
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  197k    0  197k    0     0   214k      0 --:--:-- --:--:-- --:--:--  214k


3) Download an http page to a file, not silent, with an identified user agent:
curl -A "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)" -o downpage_try.htm http://www.filmaffinity.com/es/film809297.html

I get the page and curl shows
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Curren
                                 Dload  Upload   Total   Spent    Left  Speed
100 48198    0 48198    0     0   336k      0 --:--:-- --:--:-- --:--:--  336k



4) If it fails, try verbose mode (to see whether curl reports anything):
curl.exe -v -o downpage_try.htm http://www.imdb.com/title/tt0068646/?ref_=fn_al_tt_7
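
The outcomes asked about above (nothing at all, a '403 Forbidden' page, or a real page) can also be told apart automatically once the file is saved. A minimal Python sketch, offered only as an illustration; the function name `classify_download` and the three labels are hypothetical.

```python
def classify_download(content):
    """Classify the text of a file saved by curl.

    'empty'     -> nothing was written (curl got no response body)
    'forbidden' -> the server answered with a 403 error page
    'ok'        -> anything else
    """
    if not content.strip():
        return "empty"
    if "<title>403 Forbidden</title>" in content:
        return "forbidden"
    return "ok"
```

An 'empty' result points at the local side (path, proxy, firewall), while 'forbidden' suggests the site rejected the request, for example because of the user agent.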


« Last Edit: November 03, 2016, 09:03:42 pm by VVV_Easy_Programing »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #9 on: November 04, 2016, 07:43:36 pm »
Quote
My firewall is the Windows one. Do you have a firewall other than the standard Windows one?
Yes

Quote
I have Firefox installed. Do you have it?
Yes

Quote
My antivirus is Avast. Do you have a different antivirus?
NOD32

Quote
What is the content of your downloaded file? Something like this, or nothing at all?
                 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
                 <html><head>
                <title>403 Forbidden</title>
Nothing at all

Quote
Can you try these commands in a DOS window in the Scripts folder?

1) Download an httpS page to a file, not silent
curl.exe -o downpage_try.htm https://www.rottentomatoes.com/m/godfather

I get the page and curl shows
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  222k  100  222k    0     0  87034      0  0:00:02  0:00:02 --:--:-- 87034
Could not resolve host:  Files\Personal


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #10 on: November 04, 2016, 07:54:19 pm »
Obviously I'm doing something wrong or I have overlooked something, because none of the changes to the code that you mentioned work, so please post the whole revised code.

The full path where PVD v0.9.9.21 keeps its scripts is:

C:\Program Files\Personal Video Database\Scripts

Btw:
In which folder should the downpage_try.htm or downpage.htm file be?


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #11 on: November 04, 2016, 09:38:33 pm »
Don't worry, all the files used and created (downpage.htm) must be in the PVdB Scripts folder.
We need to run two tests.
1) Use the new script. It automatically detects the Scripts folder, which must contain Rottentomatoes_[HTTPS].psf and the three curl files. Post the Log window to the forum after trying the PVdB movie with http://www.rottentomatoes.com/m/godfather (don't worry if it doesn't scrape the information).
2) Decompress curl_try.zip into the Scripts folder (again alongside the three curl files), run curl_try.bat in a DOS window and copy the DOS text output to the forum.

Once we solve the curl issue we can begin work on scraping the information.
« Last Edit: November 06, 2016, 08:38:00 pm by VVV_Easy_Programing »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #12 on: November 05, 2016, 11:20:10 am »
Quote
1) Use the new script. It automatically detects the Scripts folder, which must contain Rottentomatoes_[HTTPS].psf and the three curl files. Post the Log window to the forum after trying the PVdB movie with http://www.rottentomatoes.com/m/godfather (don't worry if it doesn't scrape the information).

Code: [Select]
(5.11.2016 10:56:31) PVD Version: 0.9.9.21
(5.11.2016 10:56:31) OS: Windows 7 Home Basic Edition
(5.11.2016 10:56:31) Plugin loaded: amazon.dll 0.4.4.9
(5.11.2016 10:56:31) Plugin loaded: amazonde.dll 0.1.2.9
(5.11.2016 10:56:32) Plugin loaded: amazonfr.dll 0.1.1.8
(5.11.2016 10:56:32) Plugin loaded: amazonuk.dll 0.1.1.8
(5.11.2016 10:56:33) Plugin loaded: plainexp.dll 0.7.1.1
(5.11.2016 10:56:33) Plugin loaded: scriptint.dll 0.2.6.1
(5.11.2016 10:56:33) Plugin loaded: valueconvert.dll 0.1.0.2
(5.11.2016 10:56:33) Compiling script: Rottentomatoes_[HTTPS].psf
(5.11.2016 10:56:33) Script compiled successfully: Rottentomatoes_[HTTPS].psf
(5.11.2016 10:56:33) Executing script binary
(5.11.2016 10:56:33) Script loaded: Rottentomatoes_[HTTPS].psf 0.2.0.0
(5.11.2016 10:56:34) Loading database: C:\Users\Ivo\Documents\Personal Video Database\movies.pvd
(5.11.2016 10:56:34) UpdateToolbar: 1
(5.11.2016 10:56:36) UpdateToolbar: 2
(5.11.2016 10:56:36) UpdateToolbar: 3
(5.11.2016 10:56:41) Compiling script: Rottentomatoes_[HTTPS].psf
(5.11.2016 10:56:41) Script compiled successfully: Rottentomatoes_[HTTPS].psf
(5.11.2016 10:56:41) Executing script binary
(5.11.2016 10:56:41) Logging in...
(5.11.2016 10:56:41) Function GetDownloadURL|
(5.11.2016 10:56:41) Global Var-Mode|0|
(5.11.2016 10:56:41) Global Var-DownloadURL||
(5.11.2016 10:56:41)       Local Var-ScriptPath|C:\Program Files\Personal Video Database\Scripts\|
(5.11.2016 10:56:41)       Download with curl in file:|C:\Program Files\Personal Video Database\Scripts\downpage.htm| the information of:|https://www.rottentomatoes.com/m/godfather||
(5.11.2016 10:56:43)       Return for parsing movie with stored information of:|https://www.rottentomatoes.com/m/godfather||
(5.11.2016 10:56:43) Searching movie information for: Godfather
(5.11.2016 10:56:43) Function ParsePage|
(5.11.2016 10:56:43) Global Var-Mode|1|
(5.11.2016 10:56:43) Global Var-DownloadURL|https://www.rottentomatoes.com/m/godfather|
(5.11.2016 10:56:43) Local Var-URL||
(5.11.2016 10:56:43) Local Var-HTML.Begin||End

Quote
2) Decompress curl_try.zip into the Scripts folder (again alongside the three curl files), run curl_try.bat in a DOS window and copy the DOS text output to the forum.

This is the result of the tests following your instructions, without proxy settings. Previously it did not work properly because proxy settings were enabled; I had been using them for testing with Proxomitron.


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #13 on: November 05, 2016, 11:23:06 am »
I did some testing with the curl_try file: I changed its settings, leaving only this one:
Code: [Select]
curl.exe -o downpage.htm https://www.rottentomatoes.com/m/godfather
I fixed the Rottentomatoes_[HTTPS] script and now everything works as it should.

The corrected Rottentomatoes_[HTTPS] script is attached.


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #14 on: November 05, 2016, 12:19:57 pm »
Good news.
I have to ask you some questions.
Do you want to make a complete script, or only one for your custom fields?
If complete: do you want me to help you with the script? May I propose some improvements?
1) I attach: 0.2.0.3 (05/11/2016) -> VVV: Some improvements and explanations in the code, without touching the logic.
2) Changes that touch the code:
    I see that the structure is a little confusing. Do you want me to tidy it up?
    I see that the script uses AddFieldValue, which is deprecated. AddFieldValueXML is better. Do you want me to change it?
    I think that ParseSearchResults doesn't work. Do you want me to program it?

I don't maintain another script but I can help you.
« Last Edit: November 06, 2016, 08:37:41 pm by VVV_Easy_Programing »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #15 on: November 05, 2016, 01:41:49 pm »
Quote
2) Changes that touch the code:
    I see that the structure is a little confusing. Do you want me to tidy it up?
    I see that the script uses AddFieldValue, which is deprecated. AddFieldValueXML is better. Do you want me to change it?
    I think that ParseSearchResults doesn't work. Do you want me to program it?

Yes please, because a script with the additional scores in custom fields and working search results will also be useful to other users of this forum.

For me personally, some additional information should be added in a common custom field. You can see what I arranged, and how, in the RottenTomatoes (full) script attached below. This should be combined into a single script; the version for publication could then keep only the additional scores in the custom fields, if that can be arranged.

Thank you in advance for your work and effort.


Btw:
How do I now combine all the files from curl_try with Rottentomatoes_[HTTPS] into one executable file?

Will I now have to enter each movie's URL manually in the curl_try file, which would of course be lengthy work for many movies, or will this be handled automatically?


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #16 on: November 06, 2016, 10:05:02 am »
Quote
I think that ParseSearchResults doesn't work. Do you want me to program it?

Yes, it really does not work.

When I changed the code in the ParseSearchResults procedure to this,
Code: [Select]
procedure ParseSearchResults(HTML : String);
var
  curPos, endPos : Integer;
  Title, Year, URL, Preview : String;
begin
  LogMessage('Function ParseSearchResults|');
  LogMessage('Global Var-Mode|'+IntToStr(Mode)+'|');
  LogMessage('Global Var-DownloadURL|'+DownloadURL+'|');

  curPos := Pos('<h1>Search Results for : ', HTML);
  if curPos < 1 then
    Exit;

  LogMessage('Parsing search results...');

  curPos := PosFrom('<a class="unstyled articleLink" href="/m/', HTML, curPos);
  while curPos > 0 do begin
    endPos := PosFrom('">', HTML, curPos);
    URL := 'http://www.rottentomatoes.com'+Trim(Copy(HTML, curPos+38, endPos - curPos-38));

    //curPos := PosFrom('">', HTML, curPos);
    //endPos := PosFrom('</a>', HTML, curPos);
    Title := TextBetween(HTML, '">', '</a>', True, curPos);

    //curPos := PosFrom('<span class="movie_year">', HTML, curPos);
    //endPos := PosFrom('</span> </div>', HTML, curPos);
    Year := TextBetween(HTML, '<span class="movie_year">', '</span>', True, curPos);

    AddSearchResult(Title+' '+Year, '', '', URL, '');

    curPos := PosFrom('<a class="unstyled articleLink" href="/m/', HTML, curPos);
  end;

end;
the search results now work.
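
For anyone who wants to test the same matching logic against a saved page before touching the script, here is a rough Python equivalent. It is only a sketch: it uses regular expressions instead of the PVdB functions, the name `parse_search_results` is hypothetical, and it assumes the page still uses the markers shown in the procedure above.

```python
import re

LINK = re.compile(r'<a class="unstyled articleLink" href="(/m/[^"]*)">(.*?)</a>', re.S)
YEAR = re.compile(r'<span class="movie_year">(.*?)</span>', re.S)


def parse_search_results(html, base="http://www.rottentomatoes.com"):
    """Return a list of (title_with_year, url) tuples found in html."""
    results = []
    for match in LINK.finditer(html):
        url = base + match.group(1).strip()
        title = match.group(2).strip()
        # Look for the year only after this link, as the procedure does.
        year_match = YEAR.search(html, match.end())
        year = year_match.group(1).strip() if year_match else ""
        results.append((f"{title} {year}".strip(), url))
    return results
```

Feeding it the text of a downloaded search page gives the same title, year and URL triples that the procedure passes to AddSearchResult.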

BTW:
Quote
Do you want to make a complete script, or only one for your custom fields?

It can also be just the one for custom fields, if you can; no problem.


Offline VVV_Easy_Programing

  • Older Power User
  • *****
  • Posts: 199
    • View Profile
Re: curl - solution for https
« Reply #17 on: November 06, 2016, 08:34:08 pm »
Attached you have a working version:
VERSION:    0.3.0.0 (06/11/2016) -> VVV: Rebuild full script.

It retrieves the main info (you can complete the rest; I left hints in the code).
I don't know all the custom fields. I retrieved some ratings, but I left hints and questions in the code.
No poster or screenshot (if you have the https link I could retrieve it with curl and load it into PVdB from a file, but it's not easy to find in the page).

I see that RT is a slow site. Don't worry if the DOS window stalls for 5 or 6 seconds when you run it in PVdB: the script waits (this was a big problem; now it is solved).

Quote
How do I now combine all the files from curl_try with Rottentomatoes_[HTTPS] into one executable file?
The script works directly in PVdB with no need to go out to DOS (curl_try was only a check).

For distribution on the internet, the best way to package the 4 needed files:
              Scripts/Rottentomatoes_[HTTPS].psf,
              Scripts/curl-ca-bundle.crt,
              Scripts/curl.exe,
              Scripts/libcurl.dll
would be a zip file, but its size exceeds the forum limit. The script now checks that all the files are present.

If you want to download the original ones: https://bintray.com/artifact/download/vszakats/generic/curl-7.50.2-win32-mingw.7z (the needed files are in the "curl-7.50.2-win32-mingw\bin\" folder)
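
If you want to verify your own Scripts folder before running the import, the same presence check can be done by hand. A minimal Python sketch, not taken from the script; the name `missing_files` is hypothetical and the file list is the four files named above.

```python
from pathlib import Path

REQUIRED_FILES = (
    "Rottentomatoes_[HTTPS].psf",
    "curl-ca-bundle.crt",
    "curl.exe",
    "libcurl.dll",
)


def missing_files(scripts_folder, required=REQUIRED_FILES):
    """Return the required file names that are absent from scripts_folder."""
    folder = Path(scripts_folder)
    return [name for name in required if not (folder / name).is_file()]
```

An empty list means everything is in place; otherwise the returned names tell you what still has to be copied from the curl archive.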

Quote
Will I now have to enter each movie's URL manually in the curl_try file, which would of course be lengthy work for many movies, or will this be handled automatically?
The script now has a fully automatic search, but it must save the base URL as 'http' (not httpS), for backward compatibility and to prevent PVdB from crashing. Do the same if you enter it manually.

Finally, it's not difficult to adapt this script to another HTTPS site. Everyone is free to use it.
« Last Edit: November 08, 2016, 08:09:17 pm by VVV_Easy_Programing »

Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #18 on: November 08, 2016, 09:02:53 am »
A big thank you for the script and for the effort and work around it.

This script has one big flaw: it does not work properly, as the log file shows.
Code: [Select]
(8.11.2016 8:46:09) PVD Version: 0.9.9.21
(8.11.2016 8:46:10) OS: Windows 7 Home Basic Edition
(8.11.2016 8:46:10) Plugin loaded: plainexp.dll 0.7.1.1
(8.11.2016 8:46:10) Plugin loaded: scriptint.dll 0.2.6.1
(8.11.2016 8:46:10) Plugin loaded: valueconvert.dll 0.1.0.2
(8.11.2016 8:46:10) Compiling script: Rottentomatoes_[HTTPS].psf
(8.11.2016 8:46:10) Script compiled successfully: Rottentomatoes_[HTTPS].psf
[Hint] (199:1): Variable 'ITEMSUBLIST' never used
(8.11.2016 8:46:10) Executing script binary
(8.11.2016 8:46:10) Script loaded: Rottentomatoes_[HTTPS].psf 0.2.0.3
(8.11.2016 8:46:10) Loading database: C:\Users\Ivo\Documents\Personal Video Database\movies.pvd
(8.11.2016 8:46:11) UpdateToolbar: 1
(8.11.2016 8:46:12) UpdateToolbar: 2
(8.11.2016 8:46:13) UpdateToolbar: 3
(8.11.2016 8:46:37) UpdateToolbar: 4
(8.11.2016 8:46:45) GET: http://www.videodb.info/upload/check.php
(8.11.2016 8:46:47) Update error: Socket Error # 10061
Connection refused.
(8.11.2016 8:46:52) UpdateToolbar: 5
(8.11.2016 8:47:03) Compiling script: Rottentomatoes_[HTTPS].psf
(8.11.2016 8:47:03) Script compiled successfully: Rottentomatoes_[HTTPS].psf
[Hint] (199:1): Variable 'ITEMSUBLIST' never used
(8.11.2016 8:47:03) Executing script binary
(8.11.2016 8:47:03) Logging in...
(8.11.2016 8:47:03) Function GetDownloadURL======================|
(8.11.2016 8:47:03) Global Var-Mode|0|
(8.11.2016 8:47:03) Global Var-DownloadURL||
(8.11.2016 8:47:03) Global Var-ScriptPath|C:\Program Files\Personal Video Database\Scripts\|
(8.11.2016 8:47:03)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:04)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:05)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:06)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:07)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:08)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:09)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:11)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:12)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:13)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:14)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:47:15)       Waiting 1s for delete:C:\Program Files\Personal Video Database\Scripts\downpage.htm

This always happens, whether there is a URL (http or https) or not. The error is caused by this part of the code.
Code: [Select]
    //Delete the old downloaded page file. Needed so we can wait for the curl download.
    While FileExists(ScriptPath+BASE_DOWNLOAD_FILE) do begin
         FileExecute('cmd.exe', '/C del '+ScriptPath+BASE_DOWNLOAD_FILE);
         LogMessage('      Waiting 1s for delete:'+ScriptPath+BASE_DOWNLOAD_FILE);
         wait (1000);
    end;
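
One possible cause, offered as an assumption rather than a confirmed diagnosis: the Scripts path contains spaces and is passed to `del` without quotes, so the command receives several separate arguments and the file is never deleted, which would keep FileExists true forever. The earlier curl message "Could not resolve host:  Files\Personal" fits the same pattern. The Python sketch below only illustrates how an unquoted path falls apart; `naive_split` and `quote_if_needed` are hypothetical helpers, not PVdB functions.

```python
def naive_split(command_line):
    """Split on whitespace, the way an unquoted command line breaks into arguments."""
    return command_line.split()


def quote_if_needed(path):
    """Wrap a path in double quotes when it contains a space."""
    return f'"{path}"' if " " in path else path


path = r"C:\Program Files\Personal Video Database\Scripts\downpage.htm"
print(naive_split("del " + path))        # the path is broken into four pieces
print("del " + quote_if_needed(path))    # the quoted path stays in one piece
```

If this is the cause, putting double quotes around the file path in the command string would be the thing to try; limiting the number of retries would also stop the loop from running forever.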


If this part of the code is commented out, the script works normally, as the log file shows:
Code: [Select]
(8.11.2016 8:54:26) PVD Version: 0.9.9.21
(8.11.2016 8:54:27) OS: Windows 7 Home Basic Edition
(8.11.2016 8:54:27) Plugin loaded: plainexp.dll 0.7.1.1
(8.11.2016 8:54:27) Plugin loaded: scriptint.dll 0.2.6.1
(8.11.2016 8:54:27) Plugin loaded: valueconvert.dll 0.1.0.2
(8.11.2016 8:54:27) Compiling script: Rottentomatoes_[HTTPS].psf
(8.11.2016 8:54:27) Script compiled successfully: Rottentomatoes_[HTTPS].psf
[Hint] (199:1): Variable 'ITEMSUBLIST' never used
(8.11.2016 8:54:27) Executing script binary
(8.11.2016 8:54:27) Script loaded: Rottentomatoes_[HTTPS].psf 0.2.0.3
(8.11.2016 8:54:27) Loading database: C:\Users\Ivo\Documents\Personal Video Database\movies.pvd
(8.11.2016 8:54:28) UpdateToolbar: 1
(8.11.2016 8:54:30) UpdateToolbar: 2
(8.11.2016 8:54:30) UpdateToolbar: 3
(8.11.2016 8:54:58) Compiling script: Rottentomatoes_[HTTPS].psf
(8.11.2016 8:54:58) Script compiled successfully: Rottentomatoes_[HTTPS].psf
[Hint] (199:1): Variable 'ITEMSUBLIST' never used
(8.11.2016 8:54:58) Executing script binary
(8.11.2016 8:54:58) Logging in...
(8.11.2016 8:54:58) Function GetDownloadURL======================|
(8.11.2016 8:54:58) Global Var-Mode|0|
(8.11.2016 8:54:58) Global Var-DownloadURL||
(8.11.2016 8:54:58) Global Var-ScriptPath|C:\Program Files\Personal Video Database\Scripts\|
(8.11.2016 8:54:58)       Some error deleting: C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:54:58)       Download with curl in file:|C:\Program Files\Personal Video Database\Scripts\downpage.htm| the information of:|https://www.rottentomatoes.com/m/godfather||
(8.11.2016 8:55:00)       Return for parsing movie with stored information of:|https://www.rottentomatoes.com/m/godfather||
(8.11.2016 8:55:02)       Now present file: C:\Program Files\Personal Video Database\Scripts\downpage.htm
(8.11.2016 8:55:02) Searching movie information for: Godfather
(8.11.2016 8:55:02) Function ParsePage======================|
(8.11.2016 8:55:02) Global Var-Mode|1|
(8.11.2016 8:55:02) Global Var-DownloadURL|https://www.rottentomatoes.com/m/godfather|
(8.11.2016 8:55:02) Global Var-ScriptPath|C:\Program Files\Personal Video Database\Scripts\|
(8.11.2016 8:55:02)        Local Var-URL||
(8.11.2016 8:55:02)      Parsing movie page
(8.11.2016 8:55:02)       Get result url:http://www.rottentomatoes.com/m/godfather||
(8.11.2016 8:55:02)       Get result title:The Godfather||
(8.11.2016 8:55:02)       Get result year:1972||
(8.11.2016 8:55:02)       Get result description:Popularly viewed as one of the best American films ever made, the multi-generational crime saga The Godfather is a touchstone of cinema: one of the most widely imitated, quoted, and lampooned movies of all time. Marlon Brando and Al Pacino star as Vito Corleone and his youngest son, Michael, respectively. It is the late 1940s in New York and Corleone is, in the parlance of organized crime, a "godfather" or "don," the head of a Mafia family. Michael, a free thinker who defied his father by enlisting in the Marines to fight in World War II, has returned a captain and a war hero. Having long ago rejected the family business, Michael shows up at the wedding of his sister, Connie (Talia Shire), with his non-Italian girlfriend, Kay (Diane Keaton), who learns for the first time about the family "business." A few months later at Christmas time, the don barely survives being shot by gunmen in the employ of a drug-trafficking rival whose request for aid from the Corleones' political connections was rejected. After saving his father from a second assassination attempt, Michael persuades his hotheaded eldest brother, Sonny (James Caan), and family advisors Tom Hagen (Robert Duvall) and Sal Tessio (Abe Vigoda) that he should be the one to exact revenge on the men responsible. After murdering a corrupt police captain and the drug trafficker, Michael hides out in Sicily while a gang war erupts at home. Falling in love with a local girl, Michael marries her, but she is later slain by Corleone enemies in an attempt on Michael's life. Sonny is also butchered, having been betrayed by Connie's husband. As Michael returns home and convinces Kay to marry him, his father recovers and makes peace with his rivals, realizing that another powerful don was pulling the strings behind the narcotics endeavor that began the gang warfare. 
Once Michael has been groomed as the new don, he leads the family to a new era of prosperity, then launches a campaign of murderous revenge against those who once tried to wipe out the Corleones, consolidating his family's power and completing his own moral downfall. Nominated for 11 Academy Awards and winning for Best Picture, Best Actor (Marlon Brando), and Best Adapted Screenplay, The Godfather was followed by a pair of sequels. ~ Karl Williams, Rovi||
(8.11.2016 8:55:02)       Get result mpaa:R (N/A)||
(8.11.2016 8:55:02)            Parse results List Genre:Drama||
(8.11.2016 8:55:02)       Get results Genre:Drama||
(8.11.2016 8:55:02)            Parse results List Directors:Francis Ford Coppola||
(8.11.2016 8:55:02)       Get results Directors:Francis Ford Coppola||
(8.11.2016 8:55:02)            Parse results List Writers:Francis Ford Coppola                , Mario Puzo||
(8.11.2016 8:55:02)       Get results Writers:Francis Ford Coppola||
(8.11.2016 8:55:02)       Get results Writers:Mario Puzo||
(8.11.2016 8:55:02)       Get result ¿TOMATOMETER All critics?:99||
(8.11.2016 8:55:02)       Get result RT average rating All critics:8.8||
(8.11.2016 8:55:02)       Get result ¿TOMATOMETER Top critics?:95||
(8.11.2016 8:55:02)       Get result RT average rating Top critics:8.8||
(8.11.2016 8:55:02)       Get result ¿AUDIENCE SCORE?:98||
(8.11.2016 8:55:02)       Get result RT average rating audience:4.4/5||
(8.11.2016 8:55:02) Script end after retreived info of movie
(8.11.2016 8:55:14) UpdateToolbar: 4
(8.11.2016 8:55:14) UpdateToolbar: 5

As you yourself observed earlier, you cannot find the reason why this part of the code does not work.

Perhaps some corrections could be made in that part.

BTW:
The version of our script that you mentioned earlier (0.3.0.0) does not match the version written in the script itself (0.2.0.3).


Ivek23
Win 10 64bit (32bit)   PVD v0.9.9.21, PVD v1.0.2.7, PVD v1.0.2.7 + MOD


Offline Ivek23

  • Global Moderator
  • *****
  • Posts: 2711
    • View Profile
Re: curl - solution for https
« Reply #19 on: November 08, 2016, 09:42:57 am »
Quote
Quote
How can I now combine all the files, from curl_try to Rottentomatoes_ [HTTPS], into one executable file?
The script works directly in PVdB without needing to go out to DOS (curl_try was only a check).

It is still necessary to manually write the URL of the search results, or the URL of a particular movie, into the curl_try file so that curl transfers the information into the downpage.htm file.

How do I arrange that, for example, running the curl_try file to download the search results does not overwrite the data for a particular movie in downpage.htm, but appends to it instead? At present the search data has to be copied by hand in front of the movie information in the downpage.htm file.

Could such an option be added to the script, so that it would not be necessary to rerun the curl_try file every time before transferring the data for a movie?
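
To make the question concrete, here is a rough sketch of the appending behaviour I mean. It assumes curl_try is just a small wrapper around the curl command from Reply #1. The function name append_page is only an illustration and is not part of PVdB or curl.

```shell
#!/bin/sh
# Hypothetical helper: fetch one URL with curl and APPEND the page to a file.
# "append_page" is an illustrative name, not something from the script.
append_page() {
    url=$1    # page to download (search results or a movie page)
    out=$2    # file that collects the downloaded pages, e.g. downpage.htm
    # -s keeps curl quiet, as in the command from Reply #1.
    # ">>" appends curl's output to the file; "-o file" or ">" would overwrite it.
    curl -s "$url" >> "$out"
}

# Example use: the URL is passed as an argument, so nothing has to be
# edited by hand before each download.
#   append_page "http://www.rottentomatoes.com/m/godfather" downpage.htm
```

In a Windows .bat file the same ">>" redirection applies, and %1 would receive the URL given on the command line. The redirection is done by the command interpreter, not by curl itself, so this only works when the command is run through cmd or a batch file. Whether the PVdB FileExecute procedure does that is an assumption I cannot confirm.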
Ivek23
Win 10 64bit (32bit)   PVD v0.9.9.21, PVD v1.0.2.7, PVD v1.0.2.7 + MOD