Reading a file line by line in reverse with PHP

Here is a handy function for reading a file line by line in reverse (from the end of the file).

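function rfgets($handle) {
    // Reads the previous line of the file, or returns null when there is none.
    // Assumes single-character line breaks and a trailing line break at the
    // end of the file.
    if (!$handle) {
        return null;
    }

    $line = '';
    $started = false;
    $gotline = false;

    while (!$gotline) {
        if (ftell($handle) == 0) {
            // Fresh handle: jump to the last byte of the file.
            if (fseek($handle, -1, SEEK_END) === -1) {
                return null; // empty file
            }
        } elseif (fseek($handle, -2, SEEK_CUR) === -1) {
            // Cannot step back any further: the start of the file was reached.
            return $started ? strrev($line) : null;
        }

        $char = fgetc($handle);

        if (false === $char) {
            $gotline = true;
        } elseif ($char == "\n" || $char == "\r") {
            // The first line break ends the line being read; the second one
            // belongs to the line before it, so stop there.
            if ($started) {
                $gotline = true;
            } else {
                $started = true;
            }
        } elseif ($started) {
            $line .= $char;
        }
    }

    // Step forward so the next call sees the line break just read.
    fseek($handle, 1, SEEK_CUR);

    // Characters were collected back to front, so reverse them.
    return strrev($line);
}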

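$filename = 'top-1m.csv';

echo "Reverse reading $filename" . PHP_EOL;

$handle = @fopen($filename, 'r');

if ($handle) {
    // Print the last 10 lines, last line first.
    for ($i = 0; $i < 10; $i++) {
        $buffer = rfgets($handle);
        echo $buffer . PHP_EOL;
    }

    fclose($handle);
} else {
    echo "Could not open $filename" . PHP_EOL;
}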

The output produced (from the Alexa top 1 million domains list) is:

[]
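
Stepping back one byte at a time means one seek and one read per character, which can get slow on a file with a million lines. One possible alternative is to read fixed-size blocks starting from the end and split each block into lines. The following is only a sketch under a few assumptions: lines end with "\n", and the name reverse_lines and the $chunksize parameter are made up for this example rather than taken from the function above.

```php
<?php
// Sketch: yield the lines of a file from last to first, reading in blocks.
// reverse_lines() and $chunksize are illustrative names (assumptions).
function reverse_lines($filename, $chunksize = 4096) {
    $handle = @fopen($filename, 'rb');
    if (!$handle) {
        return;
    }

    fseek($handle, 0, SEEK_END);
    $pos = ftell($handle);
    $tail = '';      // partial line held back until the earlier block is read
    $first = true;   // true until the block at the end of the file is handled

    while ($pos > 0) {
        $read = min($chunksize, $pos);
        $pos -= $read;
        fseek($handle, $pos);
        $lines = explode("\n", fread($handle, $read) . $tail);

        if ($first) {
            // A trailing line break leaves an empty last piece; drop it.
            if (end($lines) === '') {
                array_pop($lines);
            }
            $first = false;
        }

        // The first piece may continue in the previous block, so hold it back.
        $tail = array_shift($lines);

        for ($i = count($lines) - 1; $i >= 0; $i--) {
            yield $lines[$i];
        }
    }

    fclose($handle);

    if (!$first) {
        yield $tail; // the first line of the file
    }
}
```

Because it is a generator, a caller can loop over reverse_lines('top-1m.csv') with foreach and break after ten lines without reading the rest of the file.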

Reading random lines from a file with PHP

While developing a testing framework I decided it would be nice to use a random sample of records from Alexa’s Top 1 million domains list. Here is the function I wrote to read a given number of random lines from the file.

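function random_lines($filename, $numlines, $unique=true) {
    if (!file_exists($filename) || !is_readable($filename)) {
        return null;
    }

    $filesize = filesize($filename);
    $lines = array();
    $n = 0;

    $handle = @fopen($filename, 'r');

    if ($handle) {
        // Note: this loops forever if the file cannot supply enough lines,
        // for example when $unique is true and $numlines exceeds the line count.
        while ($n < $numlines) {
            // Jump to a random byte, skip the rest of that line,
            // then collect the line that follows it.
            fseek($handle, rand(0, $filesize));

            $started = false;
            $gotline = false;
            $line = "";

            while (!$gotline) {
                if (false === ($char = fgetc($handle))) {
                    $gotline = true;
                } elseif ($char == "\n" || $char == "\r") {
                    if ($started) {
                        $gotline = true;
                    } else {
                        $started = true;
                    }
                } elseif ($started) {
                    $line .= $char;
                }
            }

            // Nothing was collected, e.g. the jump landed in the last line.
            if ($line === '') {
                continue;
            }

            // in_array() instead of array_search(): the latter returns index 0,
            // which is falsy, when the match is the first element.
            if ($unique && in_array($line, $lines, true)) {
                continue;
            }

            $n++;
            array_push($lines, $line);
        }

        fclose($handle);
    }

    return $lines;
}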

// Example usage
$lines = random_lines('top-1m.csv', 100);
echo json_encode($lines) . PHP_EOL;

The output produced is:

[]
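
One property of seeking to a random byte is worth keeping in mind: the line returned is the one after the byte that was hit, so lines that follow long lines are picked more often and the first line of the file is never picked. If every line should have the same chance, reservoir sampling is one option, at the cost of reading the whole file once. The sketch below is an alternative technique, not the function above; the name reservoir_lines is an assumption made for this example.

```php
<?php
// Sketch: uniform sample of up to $numlines lines in a single pass
// (reservoir sampling, "Algorithm R"). reservoir_lines() is an illustrative name.
function reservoir_lines($filename, $numlines) {
    $handle = @fopen($filename, 'r');
    if (!$handle) {
        return null;
    }

    $sample = array();
    $seen = 0; // number of lines read so far

    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");
        $seen++;

        if (count($sample) < $numlines) {
            // Fill the reservoir with the first $numlines lines.
            $sample[] = $line;
        } else {
            // Replace a random slot with probability $numlines / $seen.
            $j = mt_rand(0, $seen - 1);
            if ($j < $numlines) {
                $sample[$j] = $line;
            }
        }
    }

    fclose($handle);

    return $sample;
}
```

Each line of the file appears at most once in the result, so the sample is unique as long as the file itself has no duplicate lines.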

The Economist on Biometrics

[HT Bruce Schneier]

Here’s an excellent article on the use of biometrics in security systems. Here are some highlights.

Intro

Authentication of a person is usually based on one of three things: something the person knows, such as a password; something physical the person possesses, like an actual key or token; or something about the person’s appearance or behaviour. Biometric authentication relies on the third approach. Its advantage is that, unlike a password or a token, it can work without active input from the user. That makes it both convenient and efficient: there is nothing to carry, forget or lose.

Using Punycode to Access Non-Latin Domains

The Internet Corporation for Assigned Names and Numbers (ICANN) recently decided to allow the issuing of non-Latin domain names. Previously all countries were forced to use the ASCII character set, including countries whose native languages use non-ASCII characters.

To aid in the transition, a micro-language of sorts provides a smooth mapping between ASCII-only domain names and the more flexible Unicode domain names (which allow for non-Latin characters). This micro-language is known as Punycode and is specified in RFC 3492.
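
To make the mapping concrete, here is a small sketch of converting a Unicode domain name to its ASCII form in PHP. It assumes the intl extension is installed, since that is where idn_to_ascii() comes from; the wrapper name to_punycode and the sample domain are made up for this example.

```php
<?php
// Sketch: convert a Unicode domain name to its ASCII (Punycode) form.
// to_punycode() is an illustrative wrapper; idn_to_ascii() requires ext-intl.
function to_punycode($domain) {
    if (!function_exists('idn_to_ascii')) {
        return null; // intl extension not available
    }

    // UTS #46 is passed explicitly so older PHP 7 releases do not fall back
    // to the deprecated IDNA 2003 variant.
    return idn_to_ascii($domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
}

// "\u{00FC}" is the letter u with an umlaut, written as an escape so the
// example does not depend on the encoding of the source file.
$ascii = to_punycode("b\u{00FC}cher.example");
echo ($ascii === null ? 'intl extension missing' : $ascii) . PHP_EOL;
```

With intl available, the label containing the umlaut comes back as xn--bcher-kva, while plain ASCII labels such as "example" pass through unchanged.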

The foundation of the internet

In July 1945 Vannevar Bush published a paper titled “As We May Think” in The Atlantic Monthly. In this article Bush lays out a vision of the future wherein he hopes we will be able to “wield [the] record for true good”. By record he meant the sum total of human knowledge which has been recorded in a more permanent fashion and made easily attainable.

Here are some selections from Bush’s excellent paper that inspired the internet 65 years ago.

Android Bible Flashcards

Bible Flashcards contains thousands of Greek and Hebrew flashcards to help you learn the alphabet and vocabulary words. It is based on the lessons provided by CrossWire Bible Society’s free FlashCards program.

Please note that the 2.1.5 update changes how cards are stored and retrieved, so you will need to clear your learned database through the preferences screen. In case you are interested in the technical details: cards are no longer stored as escaped Unicode. They are now stored in their final encoded form, which should translate into faster render times.

Fred Brooks on the promise of object-oriented programming

One view of object-oriented programming is that it is a discipline that enforces modularity and clean interfaces. A second view emphasizes encapsulation, the fact that one cannot see, much less design, the inner structure of the pieces. Another view emphasizes inheritance, with its concomitant hierarchical structure of classes, with virtual functions. Yet another view emphasizes strong abstract data-typing, with its assurance that a particular data-type will be manipulated only by operations proper to it.

What to look for in a new computer

I’ve been asked by friends and family several times recently what to look for when shopping for a new computer. Regardless of whether you are looking for a laptop or a desktop, I believe there are a few guidelines that will help you in making your next computer purchase.

The key things to look for in a computer are:

Here is a good base search on laptops on Tigerdirect.com, and here is one for desktops.

The state of automotive computer security

Automotive security came under fire recently when it was revealed that a flaw in the design of some brake systems could give a remote attacker access to the car’s internal network.

But what damage could an attacker really do once they gained access to a car’s internal security system? And how vulnerable are most of the cars on the roads? I mean, this story is little more than another piece of interesting trivia if we’re only talking about one or two models of a high-end luxury car.

Is there such a thing as a cyberwar?

Intelligence Squared recently held an interesting and thought-provoking debate where the concept of cyberwar was addressed.

The central issue in the debate is this: Are we justified in calling any form of aggression carried out in a synthetic space such as the internet a “war”?

In my estimation Bruce Schneier brings up some very good points and concerns in his portion of the debate, points and concerns that, as far as I could tell, were never really addressed by his opponents.
