Thursday, December 30, 2010

Linguistic sloppiness

The NMHH has gone after Tilos. The Internetz and the press are full of it, so I won't add my own opinion on the matter.

Here I only quote two excerpts from the letter (mirror) of the Nemzeti Média- és Hírközlési Hatóság (National Media and Infocommunications Authority). The first is the opening of the disputed track, fittingly titled "It's on", in English; the second is the same passage in Hungarian. Whoever prepared this translation at the NMHH apparently had no idea that the passage is all about large-scale drug dealing. They presumably thought that everything revolved around the obscene expressions heard in the songs.
Yo, Ice, the organization say they can't stay in business with us any longer. What you gonna do?
We always knew we were gonna come to this point sooner or later ... we have absolutely no option but to move forward. We'll have to set up our own distribution, manufacturing, run totally independent organization and operation. We still got our connections in Texas, Miami, New York, Chicago, Detroit and soldiers on the street willing to die. I can't put any cut on the product.
The NMHH's Hungarian translation:
Yo, Ice, a szervezet azt mondja, hogy nem üzletelnek velünk többet. Mit tehetünk?
Mindig is tudtuk, hogy el fog jönni előbb vagy utább ez a pillanat, nincs más választásunk, megyünk előre. Létre fogjuk hozni a saját forgalmazó hálózatunkat, gyártásunkat, egy független céget. Még mindig megvannak a kapcsolataink Texasban, Miamiban, New Yorkban, Chicagoban, Detroitban. A katonáink hajlandóak meghalni értünk az utcán. Nem vághatok ki semmit a dalokból.
(Translated back into English, the closing sentence says: "I can't cut anything out of the songs.")

Monday, August 23, 2010

Convert CSV or TAB delimited text to SQL Insert with Python

Basic task: convert your comma-separated or tab-delimited text file into an SQL insert script.
It is done in a functional manner. Define the fields of the input file and of the destination table: excel_fields and sql_fields.
Define a mapping between the two field lists: mapping.
You can also assign a function to each element of this mapping. The function is applied to the value of the input field to produce the sanitized or derived db field value in your insert SQL: map_func.
The code that does the actual work is quite simple:
import sys
def print_insert(splittedline):
    # Apply each SQL field's function to its mapped input value, then join into one INSERT.
    values = map(lambda x: map_func[x](get(mapping[x], splittedline)), sql_fields)
    print(''.join([insert_start, ','.join(values), insert_end]))
f = open(sys.argv[1], 'r')
for row in filter(filter_lines, map(lambda x: x.split(field_delimiter), f.readlines())):
    print_insert(row)
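The snippet relies on several names that are defined elsewhere in the script. A minimal, self-contained sketch of what they might look like follows. The names excel_fields, sql_fields, mapping, map_func, get, filter_lines, insert_start, insert_end and field_delimiter come from the post; every value and function body here, along with the people table and the quote, make_insert and convert helpers, is a made-up assumption for illustration.

```python
# Hypothetical configuration: names follow the post, values are invented.
field_delimiter = "\t"
excel_fields = ["Name", "Born", "City"]          # columns of the input file
sql_fields = ["full_name", "birth_year"]         # columns of the destination table
mapping = {"full_name": "Name", "birth_year": "Born"}  # SQL field -> input field


def quote(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + value.strip().replace("'", "''") + "'"


# One function per SQL field, applied to the raw input value.
map_func = {"full_name": quote, "birth_year": lambda v: str(int(v))}

insert_start = "INSERT INTO people (" + ",".join(sql_fields) + ") VALUES ("
insert_end = ");"


def get(input_field, splittedline):
    """Return the value of the named input column from an already split line."""
    return splittedline[excel_fields.index(input_field)]


def filter_lines(splittedline):
    """Keep rows with the expected column count, skipping the header row."""
    return (len(splittedline) == len(excel_fields)
            and splittedline[0] != excel_fields[0])


def make_insert(splittedline):
    """Build one INSERT statement from a split input line."""
    values = [map_func[x](get(mapping[x], splittedline)) for x in sql_fields]
    return insert_start + ",".join(values) + insert_end


def convert(lines):
    """Split each line, drop unwanted rows, and return the INSERT statements."""
    rows = (line.rstrip("\n").split(field_delimiter) for line in lines)
    return [make_insert(row) for row in rows if filter_lines(row)]
```

Calling convert on the lines of a file returns one statement per accepted row. Hand-rolled quoting like this is only suitable for generating a one-off script from data you trust; when inserting directly from code, parameterized queries are the safer choice.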


Friday, May 21, 2010

BWV 846 Prelude No. 1 - Python

Suppose you have chords in this format: "C3E3G3C4E4" (this is just the opening chord; the number after each note indicates its octave). Here's some Python code to interpret chords in this format.
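A minimal sketch of one way to read this format, not necessarily the approach of the original code. It assumes that each note is a natural letter followed by a single octave digit, that C4 corresponds to MIDI note 60, and that every bar spreads its five notes in the order 1-2-3-4-5-3-4-5, played twice. The function names are hypothetical.

```python
import re

# Semitone offsets of the natural notes within one octave.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_chord(chord):
    """Split a string such as 'C3E3G3C4E4' into (note, octave) pairs."""
    pairs = re.findall(r"([A-G])(\d)", chord)
    # Reject input that contains anything besides letter+digit pairs.
    if "".join(name + octave for name, octave in pairs) != chord:
        raise ValueError("unrecognised chord string: %r" % chord)
    return [(name, int(octave)) for name, octave in pairs]


def to_midi(name, octave):
    """Convert a note to a MIDI number (assumed convention: C4 is 60)."""
    return 12 * (octave + 1) + SEMITONES[name]


def arpeggiate(chord):
    """Expand a five-note chord into one bar of sixteen MIDI notes.

    Assumed figuration: notes 1-2-3-4-5-3-4-5, played twice.
    """
    notes = [to_midi(name, octave) for name, octave in parse_chord(chord)]
    if len(notes) != 5:
        raise ValueError("expected a five-note chord")
    pattern = [0, 1, 2, 3, 4, 2, 3, 4]
    return [notes[i] for i in pattern] * 2
```

Sharps and flats are left out on purpose, because the example string does not show how the format would spell them.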

Sunday, May 16, 2010

Date range search

I've created a date range search webapp for a friend. The source is open. It was used as a data collection tool for their mid-term essay. You define a search term, a site and a date range, and for each day in the range you get the number of occurrences of the search term on the given site. The whole thing is very simple: it uses Google's date range search and always searches a one-day interval (from=to), so it iterates over the whole range and returns the result count for each day. For example, let's see the distribution of the term "rendszerváltás" on the Népszabadság online within this year.
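The per-day iteration can be sketched roughly as follows. This is an illustration under assumptions, not the webapp's actual source: each_day and daily_counts are hypothetical names, and count_results stands in for whatever performs the real search request and reads the result count.

```python
from datetime import date, timedelta


def each_day(start, end):
    """Yield every date from start to end, both inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def daily_counts(term, site, start, end, count_results):
    """Run one single-day search (from == to) for every day in the range.

    count_results is a hypothetical callable
    (term, site, day_from, day_to) -> int supplied by the caller.
    """
    return [(day, count_results(term, site, day, day))
            for day in each_day(start, end)]
```

Passing the search function in as a parameter keeps the date arithmetic separate from the network code, so the loop can be tried out with a stub before any real request is made.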
You can use this to collect data for neat charts like this one:



Friday, February 26, 2010

Google personalized search

My blogspot blog was served as the first result of a search for my friend. This is "personalized search", an opt-in feature that uses someone's search history and location as signals to determine what kind of results they'll find useful. Or as search quality engineer Patrick Riley puts it: every time you search on Google, you're a lab rat.
The search terms were about a certain song, and my blog post was a totally irrelevant result. Its PageRank is low, too. The only reason I can imagine for my blog coming first is our Gmail correspondence. How on earth could that be the first? Any idea?
screenshot:

Wednesday, February 24, 2010

OCR Example Ocropus

Update: Here is Google Docs OCR output for the same image.
On Ubuntu it takes about 20 minutes to check out Ocropus, build it from source, and try it on an example image. Here are the results.




1 Down the Rabbit-Hole
Alice was beginning to get very tired of sitting by her sister on the bank,
and of having nothing to do: once or twice she had peeped into the book her
sister was reading, but it had no pictures Or cOnversations in it, 'and what is
the use of a book, ' thought Alice 'without pictures or conversation7
So She was considering in her own mind (as well as she could, for the hot
day made her feel very sleepy and stupid) , whether the pleasure of making a
daisy-chain would be worth the trouble of getting up and picking the daisies,
when suddenly a White Rabbit with pink eyes ran close by her.
There was nothing so VERY remarkable in that ; nor did Alice think it so
VERY much out of the way to hear the Rabbit say to itself, 'Oh dearl Oh
dearl I shall be latel ' (when she thought it over afterwards, it occurred to
her that she ought to have wondered at this, but at the time it all seemed
quite natural) ; but when the Rabbit actually TOOK A WATCH OUT OF
ITS WAISTCOAT- POCKET, and looked at it, and then hurried on, Alice
started to her feet , for it flashed across her mind that she had never before
seen a rabbit with either a waistcoat-pocket , or a watch to take out of it, and
burning with curiosity, she ran across the fleld after it, and fortunately was
just in time to see it pop down a large rabbit-hole under the hedge.
In another moment down went Alice after it, never once considering how
in the world she was to get out again.
The rabbit-hole went straight on like a tunnel for some way, and then
dipped suddenly down, so suddenly that Alice had not a moment to think
about stopping herself before she found herself falling down a very deep well.
Either the well was very deep , or she fell very slowly, for she had plenty
of time as she went down to look about her and to wonder what was going
to happen next . First , she tried to look down and make out what she was
coming to, but it was too dark to see anything; then she looked at the sides
of the well, and noticed that they were flled with cupboards and book-
shelves; here and there she saw maps and pictures hung upon pegs. She took
down a j ar from one of the shelves as she passed; it was labelled ' ORANGE
MARMALADE' , but to her great disappointment it was empty: she did not
like to drop the jar for fear of killing somebody, so managed to put it into
one of the cupboards as she fell past it.
'Welll ' thought Alice to herself, 'after such a fall as this, I shall think
nothing of tumbling down stairsl How brave they'll all think me at homel
Why, I wouldn't say anything about it, even if I fell off the top of the housel '
--- --- ---

Tuesday, February 23, 2010

Compressed Sensing Image Reconstruction example

I recently learned about compressed sensing, and it blew my mind.
Here I share a few example images to demonstrate what is possible with advanced applied math algorithms, plus a few links for diving into the subject. The upper pictures are the original images, and the pictures below them are the versions reconstructed by the magic algorithms. These samples were taken from Sapiro's paper.
There's an interesting article on the subject in Wired mag and here's a good glossary page.







Friday, February 19, 2010

Anagrams again

Ács eb ingerel, 
Generál csibe. 
Bence le is rág. 
Geri csáb el: ne! 
Bengáli csere, 
berág, elcseni. 
Bele! Igen, srác! 
Náci les, rebeg: 
láger becse, ni! 
Cár bige-lesen, 
geciben les rá. 
Ági cserben el, 
Egerbe csinál. 
Written by Gerinces Ábel, Csengeri Ábel, Berceli Ágnes and Rábi Csengele

Friday, February 12, 2010

Hyped procedure (Sztárolt eljárás)

Map-reduce inside Oracle: right now this is the hyped ("sztárolt") procedure, a pun on "tárolt eljárás", the Hungarian for stored procedure.