How to Steal Website Data

This article explains in more detail how to grab website data with PHP. Since building a pattern-matching string for the preg_match_all function can look difficult to PHP beginners, I have reworked the script from my previous article into the simplest sample I could: a script that “steals” the US dollar exchange rate and the current date from the website of the Indonesian Ministry of Finance (Departemen Keuangan). You can then process the data for other purposes, such as drawing a daily line graph. Search engines will turn up millions of pages about grabbing the US dollar rate or similar data, so here I’d like to show you the simplest approach, or the dumbest one, if you like.

Our main data source is located at http://www.depkeu.go.id/Ind/Currency/ . The page contains a table of foreign exchange rates, but remember: our objective is to capture only the US dollar rate and the date.



View the page source in your browser and look for the code that refers to the US dollar rate. It should look similar to this:


<table cellpadding=0 cellspacing=0 border=0 width=500 align=center class=tabQuadTop>
<tr><th width=5% class=tabQuadBottom>#</th>
<th width=18% class=tabQuadBottom>Mata Uang</th>
<th class=tabQuadBottom>Negara</th>
<th width=18% class=tabQuadBottom>Rp</th>
<th width=13% class=tabQuadBottom>Dev (Rp)</th><th width=13% class=tabQuadBottom>Dev (%)</th></tr>
<tr class=BoardBody><td class=tabQuadBottom align=right>1</td>
<td class=tabQuadBottom>&nbsp;1 USD</td>
<td class=tabQuadBottom>Amerika Serikat</td>
<td class=tabQuadBottom align=right>9,069.20<img src=../../Images/Down.gif border=0></td>
<td class=tabQuadBottom align=right>-99.80</td>
<td class=tabQuadBottom align=right>-1.088</td></tr>


There are two pieces of data to capture: the rate, 9,069.20, and the current date, 03-03-2008. Got it? The date is located right after the string tanggal (Indonesian for “date”), and the rate is located right after the td class=tabQuadBottom align=right code that follows Amerika Serikat. Based on this pattern, we can write matching expressions for the preg_match_all function. See the code below:

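<?php
// Download the whole page as one string (needs allow_url_fopen enabled in php.ini).
$data = file_get_contents("http://www.depkeu.go.id/Ind/Currency/");
if ($data === false) {
    die("Could not fetch the page\n");
}

// The date follows the word "tanggal" and ends at the next full stop.
if (preg_match_all("/tanggal (.*?)\./", $data, $hasil)) {
    // susunTgl() is the author's date-formatting helper; it is not defined in this snippet.
    $tgl_update = susunTgl($hasil[1][0]);
    echo $tgl_update . "\n";
}

// The rate is in the cell after "Amerika Serikat"; \s* tolerates line breaks between cells.
if (preg_match_all("/<td class=tabQuadBottom>Amerika Serikat<\/td>\s*<td class=tabQuadBottom align=right>(.*?)<img src/", $data, $hasil)) {
    $hrg_dolar = substr($hasil[1][0], 0, 5);        // keep "9,069", dropping the decimals
    $hrg_dolar = str_replace(',', '', $hrg_dolar);  // remove the thousands separator
    echo $hrg_dolar . "\n";
}
?>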
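To see how the capture groups end up in $hasil[1][0] without touching the network, here is a minimal offline sketch. The sample string is an assumption (the real wording around tanggal on the page may differ), the patterns are slightly more tolerant variants of the ones above, and susunTglSketch() is a hypothetical stand-in for the susunTgl() helper, which this article does not show. It assumes the date arrives as DD-MM-YYYY.

```php
<?php
// Hypothetical stand-in for susunTgl(): turns "03-03-2008" (DD-MM-YYYY)
// into "2008-03-03" (YYYY-MM-DD). The author's real helper may differ.
function susunTglSketch(string $tgl): string
{
    list($day, $month, $year) = explode('-', trim($tgl));
    return sprintf('%04d-%02d-%02d', $year, $month, $day);
}

// Assumed sample of the page source, reduced to the two parts we care about.
$sample = "Kurs tanggal 03-03-2008.\n"
    . "<td class=tabQuadBottom>Amerika Serikat</td>\n"
    . "<td class=tabQuadBottom align=right>9,069.20<img src=../../Images/Down.gif border=0></td>";

// Group 1 of the first match is stored in $hasil[1][0].
preg_match_all("/tanggal (.*?)\./", $sample, $hasil);
$tgl = susunTglSketch($hasil[1][0]);

// \s* lets the pattern match even when the two cells sit on separate lines.
preg_match_all("/Amerika Serikat<\/td>\s*<td class=tabQuadBottom align=right>(.*?)<img src/", $sample, $hasil);
$kurs = str_replace(',', '', substr($hasil[1][0], 0, 5));

echo $tgl . "\n";
echo $kurs . "\n";
```

Trying patterns against a saved copy of the page like this is usually quicker than hitting the live site on every attempt.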


Next, save the code on your web server and run it from your browser (make sure the server is connected to the internet first). See that? The browser displays only the date and the US dollar rate, held in the two variables $tgl_update and $hrg_dolar, in a format suitable for MySQL. Combine this with your own database function and you can save the result directly to MySQL. To run it automatically every day, schedule it with crontab on a Linux server using the php -f <path_to_file> command.
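As a rough idea of what such a database function could look like, here is a minimal sketch using PDO. The table name usd_rates, its columns, the function name saveRate() and the connection details in the comments are all assumptions for illustration, not part of the original script.

```php
<?php
// Hypothetical schema, adjust to your own database:
//   CREATE TABLE usd_rates (rate_date DATE PRIMARY KEY, rate INT NOT NULL);
function saveRate(PDO $pdo, string $date, int $rate): void
{
    // REPLACE INTO overwrites the existing row if the script runs twice for one date.
    $stmt = $pdo->prepare('REPLACE INTO usd_rates (rate_date, rate) VALUES (:d, :r)');
    $stmt->execute([':d' => $date, ':r' => $rate]);
}

// Example call (placeholder credentials, not run here):
//   $pdo = new PDO('mysql:host=localhost;dbname=kurs', 'user', 'password');
//   saveRate($pdo, $tgl_update, (int) $hrg_dolar);
```

Using the date as the primary key keeps one row per day, which is convenient if cron happens to run the script more than once.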

For an even better result, you can also combine the current code with JavaScript to draw a bar chart of the daily US dollar rate, like the one in the picture below:



Now, as an exercise, take a tour of http://www.pegadaian.co.id. Go there and grab the current date and the daily gold price (Kurs Harga Emas), located as shown in the picture below:



Make a similar daily bar chart and show it on your... uhm... online jewelry store website, just in case you have one, ha ha ha.



Enjoy your homework and show me what you've got.


  1. Anonymous said,

    Thursday, April 03, 2008 11:03:00 AM

    Fetching data with implode seems to be meant for an intranet. Can it be used on web hosting? I tried the script on my hosting account, but the data grabbed from www.klikbca.com does not appear.

    Please advise.

  2. Eko Wahyudiharto said,

    Thursday, April 03, 2008 1:24:00 PM

    Grabbing is basically a file-reading technique (like fopen() in C) that uses PHP's native file() function.

    The problem is that not every web host has a policy of enabling all of PHP's access and functions. See, for example, the following reference, which states that the implode() function is very dangerous to the security of users on a web host.

    The file() call in the script above, whose result is passed to implode(), requires the URL fopen wrappers to be enabled in php.ini (the allow_url_fopen setting). See the documentation on the official PHP manual site about this.

    So, in conclusion: the fopen wrappers are not enabled at your web host (for security reasons).
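The check described in this reply can be sketched roughly as follows. fetchPage() is a hypothetical helper name, and the cURL fallback is an assumption that only helps if the host has the cURL extension installed; it is not something the original script does.

```php
<?php
// Hypothetical helper: returns the page body as a string, or false on failure.
function fetchPage(string $url)
{
    // file() and file_get_contents() can only open URLs when allow_url_fopen is on.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return @file_get_contents($url);
    }

    // Otherwise fall back to the cURL extension, if it is available.
    if (function_exists('curl_init')) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
        curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
        return curl_exec($ch);
    }

    return false; // this host offers no way to fetch remote pages
}
```

If fetchPage() returns false on your host, neither mechanism is available and you would need to ask the hosting provider about it.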

  3. Anonymous said,

    Friday, April 11, 2008 8:48:00 AM

    Why didn't you add this time: 'that's why... choose a good and proper hosting provider, hue... he... he... he... he...'?
    You really are discriminatory!!!!
