COMFIED

Scraping Website Text


I'm trying to download a web page's text content into either a memo field or a string variable.

Most examples given in this forum and the documentation work in the IDE browser, but not in external browsers such as Chrome, Safari or IE.

For example, this code works when I click Execute in the SMS IDE, but doesn't work in external browsers:

var
  v1, v2, v3: Variant;
begin
  // get the title of the page loaded in the iframe
  v1 := W3IFrameHTMLElement1.Handle.contentDocument;
  W3Memo1.Text := v1.title;

  // get the page content as HTML
  v2 := W3IFrameHTMLElement1.Handle.contentDocument.documentElement;
  W3Memo2.Text := v2.innerHTML;

  // get the page links, one per line
  W3Memo3.Text := '';
  v3 := W3IFrameHTMLElement1.Handle.contentDocument.links;
  for var i := 0 to Integer(v3.length) - 1 do
  begin
    W3Memo3.Text := W3Memo3.Text + #13#10 + v3[i].href;
  end;
end;

Kindly assist.


Unfortunately my SmartMS has expired, but I suspect this is normal behavior.

As I understand it, the cross-domain problem arises when the domain of the web page that contains the IFRAME differs from the domain of the web page opened in the IFRAME. It is normal for a page from xyz.com to load in an iframe hosted on abc.com. However, code in the parent page on abc.com cannot read or change the content of that page.
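
One way to see this restriction from code: for a page from another origin, the browser returns null for the iframe's contentDocument, so reading its title fails. Below is a minimal sketch of a guard, reusing W3IFrameHTMLElement1 and W3Memo1 from the question. ShowFrameTitle is a hypothetical method name, and testing for null inside an asm block is only my assumption about a workable way to check a Variant in SmartMS.

```pascal
// Hypothetical method; it would also need declaring in the TForm1 class.
procedure TForm1.ShowFrameTitle;
var
  doc: Variant;
  accessible: Boolean;
begin
  doc := W3IFrameHTMLElement1.Handle.contentDocument;
  // contentDocument is null when the framed page comes from another origin
  asm @accessible = (@doc != null);
  end;
  if accessible then
    W3Memo1.Text := doc.title
  else
    W3Memo1.Text := 'Cannot read the iframe: the page is from a different origin';
end;
```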

 

For this to work, you need a web server running, for instance one listening on port 8000, so that both pages share the same origin:

- in InitializeObject, set W3IFrameHTMLElement1.Src := 'http://localhost:8000/test01.htm';

- open your project at http://localhost:8000/index.html

I think this would work as expected.
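
In the form unit, the first step might look something like the sketch below. It only combines the InitializeObject override that SmartMS generates for a form with the Src assignment given above. The port and file names are the example values from this post, not anything required.

```pascal
procedure TForm1.InitializeObject;
begin
  inherited;
  {$I 'Form1:impl'}
  // Same host and port as the project page (http://localhost:8000/index.html),
  // so the parent page is allowed to read the iframe's document.
  W3IFrameHTMLElement1.Src := 'http://localhost:8000/test01.htm';
end;
```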

 

Hope this helped.


Thanks, warleyalex.

The question is: is there a way to read non-JSON text content from any website into a memo or variable using SmartMS? I need a procedure that works in non-IDE browsers.

The documentation examples given are all based on JSON and require enabling on the host's server side. If it's not possible, I hope the next release of SmartMS will fix this.


I have found demo code at http://stackoverflow.com/questions/18384170/using-smart-mobile-studio-to-interact-with-a-mysql-database

and a hosted demo at http://www.lynkit.com.au/MySQL/

It works! Credit to Nico Wouterse.

Here is the code:

//PHP file to connect to MySQL and query for data

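<?php
// Allow cross-origin requests so a SmartMS app served from another domain can call this script.
header("Access-Control-Allow-Origin: *");

// Connect with mysqli (the older mysql_* functions no longer exist in PHP 7+).
// The "p:" host prefix requests a persistent connection, as mysql_pconnect did.
$link = mysqli_connect("p:www.your-domain.com", "your-db-user", "your-user-pw", "your-mysql-db")
    or die("Could not connect");

// The statement arrives as a form-encoded POST field and is executed as-is.
$sql_statement = $_POST['sql_statement'];

$arr = array();

$rs = mysqli_query($link, $sql_statement) or die("Query failed");

// Collect every row as an object so json_encode emits an array of records.
while ($obj = mysqli_fetch_object($rs)) {
    $arr[] = $obj;
}
echo '{"smsrows":'.json_encode($arr).'}';

mysqli_close($link);
?>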
 
 
//SMARTMS CODE - UNIT TO INCLUDE IN YOUR PROJECT
Unit DS;
 
interface
 
uses
  SmartCL.System, SmartCL.Components, SmartCL.Inet;
 
type
 
  TDS = class(TW3Component)
  private
    FHttp : TW3HttpRequest;
    FSQLString : String;
    Fsmscursor: variant;
    FDBRows : Integer;
  protected
    procedure InitializeObject; override;
    procedure FinalizeObject; override;
  public
    FOnDataReady: THttpRequestEvent;
    property SQLString: String read FSQLString write FSQLString;
    property smscursor: Variant read Fsmscursor write Fsmscursor;
    property DBRows: Integer read FDBRows write FDBRows;
    property OnDataReady: THttpRequestEvent read FOnDataReady write FOnDataReady;
    procedure SQLSelect;
  end;
 
procedure Register;
 
implementation
 
procedure TDS.InitializeObject;
begin
  inherited;
  FDBRows := 0;
end;
 
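procedure TDS.SQLSelect;
var
  sql_statement : String;
  encodedstr : String;
begin
  FHttp := TW3HttpRequest.Create;
  FHttp.OnDataReady := FOnDataReady; //RetrieveSQLSelect;
  // The request must be opened before headers can be set.
  // Placeholder URL: point it at the PHP file above on your own server.
  FHttp.Open("POST", "http://www.your-domain.com/your-script.php");
  FHttp.setRequestHeader("Content-type","application/x-www-form-urlencoded");
  // URL-encode the SQL text so it survives form encoding
  asm @encodedstr = encodeURIComponent(@FSQLString);
  end;

  sql_statement := "sql_statement="+encodedstr;
  FHttp.send(sql_statement);
end;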
 
procedure TDS.FinalizeObject;
begin
  FHttp.Free;
  Fsmscursor := nil;
  inherited;
end;
 
procedure Register;
begin
//   might not be needed / replaced by IDE dialog
//  RegisterComponentsProcHandler('Data', [TDS]);
end;
 
end.
 
//SMARTMS CODE - MAIN FORM WITH BUTTON AND MEMO
unit Form1;
 
interface
 
uses 
  SmartCL.System, SmartCL.Graphics, SmartCL.Components, SmartCL.Forms,
  SmartCL.Inet,
  SmartCL.Fonts, SmartCL.Borders, SmartCL.Application,
  SmartCL.Controls.Button, DS, SmartCL.Controls.Memo;
 
type
  TForm1 = class(TW3Form)
    procedure W3Button1Click(Sender: TObject);
    DS1 : TDS;
  private
    {$I 'Form1:intf'}
  protected
    procedure InitializeForm; override;
    procedure InitializeObject; override;
    procedure Resize; override;
    procedure SQLRetrieved(Sender: TW3HttpRequest);
  end;
 
implementation
 
{ TForm1 }
 
procedure TForm1.W3Button1Click(Sender: TObject);
begin
  DS1.SQLString := 'SELECT * FROM employee';     //change to your query to get data from existing MySQL table
  DS1.FOnDataReady := SQLRetrieved;
  DS1.SQLSelect;
end;
 
procedure TForm1.SQLRetrieved(Sender: TW3HttpRequest);
begin
  // show the raw server response in the memo
  W3Memo1.Text := Sender.ResponseText;
end;
 
 
procedure TForm1.InitializeForm;
begin
  inherited;
  // this is a good place to initialize components
  DS1 := TDS.Create(self);
end;
 
procedure TForm1.InitializeObject;
begin
  inherited;
  {$I 'Form1:impl'}
end;
 
procedure TForm1.Resize;
begin
  inherited;
end;
 
initialization
  Forms.RegisterForm({$I %FILE%}, TForm1);
end.
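
For plain text rather than query results, the same request class can in principle be pointed at any URL, as long as that server sends an Access-Control-Allow-Origin header like the PHP file does (or the page is on the same origin as your app). Otherwise the external browser blocks the response. Below is a minimal sketch, assuming TW3HttpRequest has an Open method that mirrors XMLHttpRequest.open. FetchText and the example URL are hypothetical names for illustration only.

```pascal
// Hypothetical helper; it would also need declaring in the TForm1 class.
procedure TForm1.FetchText(Url: String);
var
  Http: TW3HttpRequest;
begin
  Http := TW3HttpRequest.Create;
  // Reuse the handler from the form: it copies ResponseText into W3Memo1
  Http.OnDataReady := SQLRetrieved;
  Http.Open("GET", Url);   // assumed method, mirrors XMLHttpRequest.open
  Http.send("");           // a GET request carries no body
end;
```

It could then be called from a button click, for example FetchText('http://www.your-domain.com/page.txt');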
