The #1 result on Google for stripping characters out of strings in PHP is awful and uses the deprecated ereg_replace function, so let’s make a new search result using preg_replace, which is much better and faster and is fully supported in PHP 5.3 and 5.4.
Allow only alphanumeric:
$out = preg_replace('|[^A-Za-z0-9]|', '', $in);
Allow only numbers:
$out = preg_replace('|[^0-9]|', '', $in);
Alphanumeric with whitespace:
$out = preg_replace('|[^A-Za-z0-9\s]|', '', $in);
The ^ at the start of a character class means match everything that is not listed, so you just list whatever you want to keep (symbols, numbers, letters) and it will match everything else and replace it with nothing, leaving you with a nice clean string. You can use this to filter things like hexadecimal, base64 or postcodes, or just to force plain text.
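As a rough sketch of those other filters, here is the same preg_replace trick with different character classes. The helper names are made up for this example, the scalar type hints need PHP 7 or later, and the postcode case assumes a digits-only format. Note that this only strips characters; it does not check that what is left is a valid hex string, base64 blob or postcode.

```php
<?php
// Hypothetical helper names; each one is just preg_replace with a
// different negated character class.

// Keep only hexadecimal digits.
function keepHex(string $in): string {
    return preg_replace('|[^A-Fa-f0-9]|', '', $in);
}

// Keep only characters from the standard base64 alphabet, including "=" padding.
function keepBase64(string $in): string {
    return preg_replace('|[^A-Za-z0-9+/=]|', '', $in);
}

// Keep only digits, e.g. for a digits-only postcode (assumed format).
function keepDigits(string $in): string {
    return preg_replace('|[^0-9]|', '', $in);
}

echo keepHex('#ff-00-7f'), "\n";      // ff007f
echo keepBase64("aGVsbG8=\n"), "\n";  // aGVsbG8=
echo keepDigits('NSW 2350'), "\n";    // 2350
```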
We’ve recently been provided with some really high-resolution aerial photography of various places such as Armidale, Batemans Bay, Broulee, Forster and Narooma.
The resolution ranges from 10 cm down to 6 cm on the ground, which is 8x better than Google’s satellite imagery in those areas.
It chews up over 100 GB at the moment and more areas are being flown. I had to drive a portable hard drive of data to the data center to move it all. When you can very clearly see individual wires on a power line, you know it’s good footage.
Usually the fastest crawl rate you can select with the slider in Google Webmaster Tools is 0.5 requests/second.
The growth of PropertyNow, however, has caused that limit to increase: first to 0.8 requests/second, then to 1.5 requests/second, and then even faster still to 2.5 requests/second.
That is the fastest I’ve ever heard of. There has been an increase in quality content, and the server can handle the speed, so naturally Google wants to crawl it all as soon as possible and refresh it as often as possible. The logs show that they do actually push that limit, but in bursts rather than constantly.
This change has also coincided with alterations to the search results. All keywords dropped temporarily during the speedy crawl period.
Very interesting stuff indeed.
PropertyNow Real Estate has just recently opened its doors to agents and it’s done so in a big way. A very large number of agents have already signed up and more are coming.
So much so that Google has taken an interest. Googlebot has been frantically crawling for the past couple of days, and the Crawl Rate settings have changed as well. Usually you can’t ask Googlebot to go any faster than 0.5 requests/sec, but it is now letting me select up to 1.25 requests/sec, or 0.8 seconds between requests! I’ve never seen that behaviour before.
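For anyone wondering how the two numbers on the slider relate, the delay is just the reciprocal of the rate. A tiny sketch (the function name is made up, and the type hints assume PHP 7 or later):

```php
<?php
// Hypothetical helper: convert a crawl rate in requests/second
// to the delay in seconds between requests.
function secondsBetweenRequests(float $requestsPerSecond): float {
    if ($requestsPerSecond <= 0) {
        throw new InvalidArgumentException('Rate must be positive');
    }
    return 1 / $requestsPerSecond;
}

echo secondsBetweenRequests(0.5), "\n";   // 2
echo secondsBetweenRequests(1.25), "\n";  // 0.8
```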
We’ll have to see if that is reflected by an improvement in the search results. Fingers crossed.
4chan has been on a rampage against any anti-piracy groups who annoy them and the list of casualties is pretty long.
Yesterday they attacked AFACT (Australian Federation Against Copyright Theft) and managed not only to take their site out, but also to completely flatten NetRegistry, which was their host.
On Whirlpool, NetRegistry is now being slammed for hosting them. It sounds like they will lose a bit of credibility after this one, not only because some people are sympathetic towards 4chan’s cause, but also because NetRegistry willingly hosted a high-risk site right next to everyone else’s websites.
You really see a host’s true colours after an incident like this. A NetRegistry rep, Angelina Potapova, isn’t handling the criticism very well. She’s basically said that anyone who criticises them must be one of the attackers, which isn’t a smart move when they are your customers or potential customers. She also incorrectly credited the attack to The Pirate Bay when it was 4chan, which is completely unrelated.
As someone who pays for hosting through a provider, if they were hosting a high-value target such as AFACT anywhere near my hosting, I’d be looking very closely at my SLA and I’d also look for a new host. Keeping them on the same infrastructure as everyone else is horribly stupid. Not that it would have mattered if NetRegistry had separated the site, because from the sound of it the DDoS flattened their routers as well. They went completely offline for a good hour or two and everything was sluggish for quite a while afterwards.
I sure hope they are making AFACT pay for breaking everyone’s SLA... that’s if they have one. I couldn’t find theirs, which isn’t a good sign for their customers.
By the way, yes, AFACT is the group that has been suing iiNet for not breaking the law and giving AFACT personal details on subscribers, so my sympathy is limited.
Well, I’ve finally gotten the Luxury Homes Australia Blog going, which has got some pretty cool things. We send them out as a newsletter every week or so.
I’ve also managed to finally find a very tall lava lamp. I’ve been looking for one for ages but it’s almost as if they don’t exist. It’s nice and tall with a metal stand which supports it. It only comes in blue and red with clear liquid, but beggars can’t be choosers. $40 made it a quick sell.
Well here I am at 1:50am doing server watching duty.
Why? It is the release of Cornelia Funke’s new book called Reckless, and the server I’m watching is the official website. If anything goes wrong then I need to scramble around and fix it. I didn’t create the website, but I’ve been tasked with making sure it can scale to stand up to the barrage of visitors from the official New York premiere of the book. Every single book has the URL in it, so it’s not a small feat.
It should go well.
A few more hours and then the worst will be over and I can get a little bit of sleep before real work tomorrow.
Believe it or not, I found it cheaper to get a second internet connection (TPG 512k, 10 GB limit) dedicated to VoIP than to continue paying Telstra.
I mostly only call Queensland from New South Wales, but the bill gets up to $70 to $80 a month. The main problem with VoIP is that someone might call while I’m uploading a large file or doing other work. The solution to that is a cheap, slow net connection solely for VoIP.
I’m saving $20 a month, and the extra equipment (VoIP ATA + cheap used router) and setup fees will be paid off in 6 months. After that it’s just lower bills.
Telstra simply cannot compete with 10c untimed national calls.
Well, I need to redo a property handling system from scratch: the code that moves properties from one site to another.
Not only does it need to do that, but it also needs to send only what has changed, and it needs to do it very quickly.
So that means I’ll need a complex forking PHP daemon: one master to set tasks, some slaves to focus on specific types of property transmission (e.g. REAXML), and some more slaves for minor tasks which can be run in parallel, e.g. fetching images.
All the communication will be via Gearman. It’s a lot easier that way because Gearman handles queuing and can also do priorities.
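Roughly what handing work to Gearman with a priority could look like. This is only a sketch: the job names 'send_reaxml' and 'fetch_image', the JSON payload layout and the queuePropertyJob helper are all invented for illustration. It relies on doBackground() and doHighBackground() from the PECL gearman extension's GearmanClient, and the type hints assume PHP 7 or later.

```php
<?php
// Sketch only: the job names and the payload layout are made up.

// Queue a job in the background and return the job handle.
// $client is expected to be a GearmanClient (PECL gearman) that already
// has a server registered via addServer().
function queuePropertyJob($client, string $function, array $payload, bool $urgent = false) {
    $workload = json_encode($payload);
    // doHighBackground() puts the job ahead of normal-priority jobs.
    return $urgent
        ? $client->doHighBackground($function, $workload)
        : $client->doBackground($function, $workload);
}

// Usage with the real extension (not run here, it needs a gearmand server;
// the address and port are assumptions):
//   $client = new GearmanClient();
//   $client->addServer('127.0.0.1', 4730);
//   queuePropertyJob($client, 'send_reaxml', ['property_id' => 42], true);
//   queuePropertyJob($client, 'fetch_image', ['url' => 'http://example.com/1.jpg']);
```

The slaves would then register the same job names with GearmanWorker::addFunction() and loop on work().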
Most methods of moving properties around are just XML deltas, where an element is only specified if it has changed. When a new property is made, the full data is sent. There are hacks to emulate that, but I will be going with a full history system where every single change is stored and, if required, can be replayed.
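To make the delta and replay idea concrete, here is a minimal sketch. It assumes a property is a flat associative array and that a removed field is stored as null (which means a field can't legitimately hold null); the function names and sample fields are invented, not the real system, and the type hints assume PHP 7 or later.

```php
<?php
// Sketch only: properties are modelled as flat associative arrays and the
// function names are invented for this example.

// Return only the fields whose values differ from the previous version.
// A removed field is recorded as null.
function propertyDelta(array $old, array $new): array {
    $delta = [];
    foreach ($new as $field => $value) {
        if (!array_key_exists($field, $old) || $old[$field] !== $value) {
            $delta[$field] = $value;
        }
    }
    foreach ($old as $field => $value) {
        if (!array_key_exists($field, $new)) {
            $delta[$field] = null;
        }
    }
    return $delta;
}

// Rebuild a property by applying every stored change in order.
function replayHistory(array $history): array {
    $property = [];
    foreach ($history as $delta) {
        foreach ($delta as $field => $value) {
            if ($value === null) {
                unset($property[$field]);
            } else {
                $property[$field] = $value;
            }
        }
    }
    return $property;
}

$v1 = ['price' => 450000, 'beds' => 3, 'status' => 'current'];
$v2 = ['price' => 435000, 'beds' => 3, 'status' => 'current'];

// The first entry is the full record, later entries are deltas.
$history = [propertyDelta([], $v1), propertyDelta($v1, $v2)];

print_r($history[1]);                       // only the changed price
var_dump(replayHistory($history) === $v2);  // bool(true)
```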
The main thing I’m not looking forward to is working on the code which makes sure that all the separate slaves are still running.
It needs to be able to reset itself if anything goes wrong, and log errors. Not easy stuff.
I’ll probably create some kind of heartbeat system.
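One possible shape for that heartbeat, as a sketch under assumptions rather than a design decision: each slave writes a timestamp to its own file, and the master treats anything older than a timeout as dead. The file layout, worker names and 30-second timeout are all made up, and the nullable type hints assume PHP 7.1 or later.

```php
<?php
// Sketch only: file-based heartbeats; the directory layout, worker names
// and timeout are assumptions.

// Called by each worker inside its main loop.
function beat(string $dir, string $worker, ?int $now = null): void {
    file_put_contents("$dir/$worker.beat", (string) ($now ?? time()));
}

// Called by the master: returns the names of workers whose last beat
// is older than $timeout seconds.
function staleWorkers(string $dir, int $timeout, ?int $now = null): array {
    $now = $now ?? time();
    $stale = [];
    foreach (glob("$dir/*.beat") ?: [] as $file) {
        $last = (int) file_get_contents($file);
        if ($now - $last > $timeout) {
            $stale[] = basename($file, '.beat');
        }
    }
    return $stale;
}

$dir = sys_get_temp_dir() . '/heartbeat-demo-' . getmypid();
if (!is_dir($dir)) {
    mkdir($dir);
}

beat($dir, 'reaxml', 1000);  // last seen at t=1000
beat($dir, 'images', 1055);  // last seen at t=1055

// At t=1060 with a 30 second timeout only "reaxml" is overdue.
foreach (staleWorkers($dir, 30, 1060) as $name) {
    error_log("Worker $name missed its heartbeat, restart it");
}
```

The timestamps are passed in explicitly here so the behaviour is easy to check; a real worker would just call beat() with the current time.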
I’ve been extremely busy these days.
One of my new projects is an Interactive Floor Plan builder for real estate agents, so they can do it themselves and save a whole pile of money. Firms which do them currently charge about $70 each.
Real estate agents also have horrible email addresses. For some reason @bigpond.com is the most common by far.
So I’ve also been setting up a really professional email hosting server (anti-spam, anti-virus, very fast, IMAP, POP3, SMTP) to try and get rid of some of those addresses.
We’ll see where both of those go.