What is a scripting language?
From Wikipedia we read that:
Scripting languages (commonly called scripting programming languages or script languages) are computer programming languages that are typically interpreted and can be typed directly from a keyboard. Thus, scripts are often distinguished from programs, because programs are converted permanently into binary executable files (i.e., zeros and ones) before they are run. Scripts remain in their original form and are interpreted command-by-command each time they are run. Scripts were created to shorten the traditional edit-compile-link-run process.
Of course, any self-respecting computer scientist knows that this definition is meaningless, because a language is a mathematical construct and is not tied to its implementation. You can write a C interpreter, for example (and some exist). You can also write a compiler for any ‘interpreted language’ (of course the term ‘interpreted language’ is also meaningless). And there are theoretical constructs (the Futamura projections) that, given any interpreter, can build a compiler from it (you could even implement them in practice, though the result would be terribly inefficient).
However, any other definition would be misleading as well: if you define a scripting language as a language you can use to “script” (as in “bash script”), then you’ve got a definition that’s even more stupid. bash is a scripting language, and so is Python. However, if someone writes an application that is scripted in C, C becomes a ‘scripting language’.
Both definitions have an additional problem: scripting languages are considered somewhat inferior to ‘non-scripting languages’. Some even use the term ‘true programming languages’ (and these ‘true programming languages’ are usually taken to be C and C++, for example).
Lisp, Haskell, OCaml, Prolog
In this dissertation on ‘scripting languages’, Lisp is quite emblematic. Lisp can be compiled (there are some Lisp compilers around). But Lisp can also be interpreted. This is one of the most widespread Lisp environments (and it is an interpreter). Moreover, in Emacs you can just run some Lisp snippets. And Emacs itself is scripted in Lisp (thus Lisp would be a scripting language according to the second definition). However, no one can assert that it is not a ‘true’ programming language. Emacs itself is a beautiful (and complex) piece of software, and a lot of ‘real’ (and extremely difficult) stuff in the AI field has been written in Lisp.
Haskell can be another example. Hugs is an interpreter, ghc is both an interpreter and a compiler. And Haskell is even statically typed (so you see, being dynamic is not necessary). The very same applies to OCaml or Prolog. And all these languages have been used by the most theory-oriented computer scientists to implement real programs.
I suppose nobody would say that Lisp or Haskell are ‘scripting languages’ (even if according to the definitions they are: Haskell is used in many configuration tools by Linspire programmers).
Bash, Perl, Python, Ruby.
All these languages have been labeled ‘scripting languages’. And in fact, according to the definitions, they are. However, they are very different languages.
Bash is a very simple language designed to do simple things. Unix system administrators have used it intensively (along with awk and sed and other utilities) to automate many repetitive tasks. Although this is surely ‘programming’, in some sense it’s different from programming an application from scratch. Some even consider it a ‘lower level’ kind of programming (and surely we must recognize that the skills involved in writing a boot script are different from the skills used to write a compiler).
Perl was born in this environment: its goal was text manipulation and interaction between processes in a system. It is a nicer alternative to bash + awk in many situations. A fully fledged object model was added later, but in the meantime it had been used to write very complex applications that go beyond the ‘lower level programming domain’ usually associated with ‘scripting’. Even though I don’t like Perl anymore, it is a general purpose language, with the same dignity as C or C++.
The ‘scripting language’ moniker does not fit anymore. And it fits Python even less. Python has been an object oriented programming language from the beginning. Good software engineering practices have always been employed. However, many consider it just another scripting language. And of course Python makes a great scripting language.
The domains in which Python excels are the same in which Java is used, for example. However, Python can be used for scripting, too. That is to say… some (and I’d be tempted to say ‘most’) ‘scripting languages’ can be used both as general purpose languages and for automation/scripting.
The reason why they can be used for automation/scripting is that they are easier to use and we can write programs faster. The development cycle is faster, too. And to me, these all seem good properties.
I think that the very same features that make Python (or Ruby) great for scripting make it even greater for general purpose programming. That is to say, we should consider a ‘scripting language’ not a poorer language, but a better one (yeah, of course this is exaggerated, since there are languages suited for scripting that are not suited for general purpose programming: AppleScript and bash, for example).
In Python, metaclasses are considered an advanced and difficult technique. This is certainly true. Developers of ‘production code’ are often discouraged from using metaclasses, since code that makes use of them is often overly complex (especially for people who are not used to metaclasses).
A further reason to avoid metaclasses is that they truly can be avoided. They do not add much power to Python. However, there are cases when they make ‘client’ code more readable.
Consider this situation: we have a finite set of strings that are used as dictionary keys in several parts of our application. Of course, we don’t want to use explicit strings in every point where they are needed. This is too error prone and bugs are more difficult to track.
A good solution is to put all the strings in a module. We will have code such as
FOO1 = 'foo1'
FOO2 = 'foo2'
Good. However, we may want to loop over the constants. One of the more pythonic solutions is this one:
targets =
.
In this way we can do whatever we want. However, we could use a class instead of a module. We simply define the constants as class variables. Then we can easily use the locals built-in to create the array of constants. But with a little metaclass trick, we can write more interesting client code.
We could for example write things like:
for k in PackageKey:
    # do something
or
if key in PackageKey:
    return
We can use metaclasses to define a class object that behaves in the way above.
        for v in cls.keys:
            yield v

    __metaclass__ = _ConstantNamespace
    NAME = 'name'
    VERSION = 'version'
    REVISION = 'revision'
    DIRECTORY = 'directory'
    VARIANTS = 'variants'
    HOMEPAGE = 'homepage'
    DESCRIPTION = 'description'
    BUILD_DEPENDENCIES = 'build_dependencies'
    LIBRARY_DEPENDENCIES = 'library_dependencies'
    RUNTIME_DEPENDENCIES = 'runtime_dependencies'
    PLATFORMS = 'platforms'
    MAINTAINERS = 'maintainers'
    keys =
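The listing above is incomplete, so here is a minimal, self-contained sketch of the same idea for Python 3 (where the `__metaclass__` attribute is ignored and the `metaclass=` keyword is used instead). It is not the original code: the way `keys` is computed (collecting every upper-case string attribute) and the `__contains__` method are my own assumptions, and only three of the constants are repeated.

```python
class _ConstantNamespace(type):
    """Metaclass that makes the class object itself iterable."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Assumption: every upper-case attribute holding a string is a constant.
        cls.keys = [value for attr, value in namespace.items()
                    if attr.isupper() and isinstance(value, str)]

    def __iter__(cls):
        # Enables: for k in PackageKey
        for v in cls.keys:
            yield v

    def __contains__(cls, key):
        # Enables: if key in PackageKey
        return key in cls.keys


class PackageKey(metaclass=_ConstantNamespace):
    NAME = 'name'
    VERSION = 'version'
    REVISION = 'revision'


for k in PackageKey:
    print(k)
print('version' in PackageKey, 'colour' in PackageKey)
```

Because `__iter__` and `__contains__` live on the metaclass, they apply to the class object `PackageKey` itself, which is what makes the client code read so naturally.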
In these days of Duck Typing, the distinction between classes and types, found for example in the introductory chapter of Design Patterns by Gamma, Helm, Johnson and Vlissides, is certainly coming back into fashion.
Most people today are used to languages such as Java or C++, where classes and types essentially coincide (with the possible exception of generic programming).
A type is the name used to denote a particular interface. Incidentally, this has nothing directly to do with Java’s Interfaces. The interface is the set of methods (or of messages, as Ruby, Smalltalk or ObjectiveC would put it) that an object responds to.
However, an object can implement several types, and types can be shared by objects that are very different from one another. In this respect types behave like Java interfaces. Any Java programmer knows that there are Interfaces (I use the capital letter to show that I mean the word in the “Java-esque” sense) supported by many different objects (Cloneable, for example) and that any object can implement as many Interfaces as you like.
The difference is that Java Interfaces are static and “tied” to classes. A given class declares that it implements a certain number of interfaces. The instances of that class will therefore have those interfaces as their types.
In Ruby, however, this does not happen. An object has a given type because it responds to certain messages, but we do not state anywhere what the type is. By the mere fact of responding to certain messages, an object is of a given type (regardless of its class). Types are “informal” (using the same distinction that ObjectiveC draws between formal and informal protocols: formal protocols behave much like Java Interfaces, while informal ones are precisely “not statically typed”).
The class, on the other hand, is tied directly to the implementation. Quoting the GoF’s Design Patterns:
“It is important to understand the difference between an object’s class and its type. The class defines how the object is implemented. It defines its internal state and the implementation of its operations. In contrast, an object’s type refers only to its interface, the set of requests to which it can respond. An object can have many types, and objects of different classes can have the same type.
Of course, there is a close relationship between class and type. Because a class defines the operations an object can perform, it also defines the object’s type. When we say that an object is an instance of a class, we imply that the object supports the interface defined by the class.
Languages like C++ and Eiffel (and Java, I would add) use classes to specify both an object’s type and its implementation.”
On the other hand, dynamic languages such as Ruby or Python (or Smalltalk) do not declare the types of variables. They send messages (or call methods, in Python terms) to the objects the variables denote, and if those objects support the message, everything works.
The difference between class inheritance and type inheritance is important. Class inheritance is about defining the implementation of an object in terms of the implementation of another object. It is undoubtedly a “DRY” concept. If I have already defined some things (and the objects are close enough), I can keep them the same. In Ruby this is complemented by mixins, which let you share code *without* resorting to inheritance (but that is another story); in Python you can use multiple inheritance to emulate mixins.
Type inheritance, instead, is about using one object in place of another. In this sense we are closer to Barbara Liskov’s Substitution Principle. An excellent article on the LSP can be found here. In essence we can state the Liskov Substitution Principle as:
T1 is a subtype of T2 if for every object O1 of type T1 there is an object O2 of type T2 such that, for every program defined in terms of T2, the behavior of the program is unchanged when O1 is substituted for O2.
It is easy to confuse type inheritance and class inheritance. Many languages draw no distinction between the two; indeed, classes are used to define types. Many programmers, for example, see the LSP only in relation to subclasses (in C++ it is indeed important to make objects behave “well” with respect to the LSP, since a pointer or reference to a supertype can always be made to point to an object of its subtype).
In reality Liskov’s principle is *very* strict. Two objects are substitutable in Liskov’s sense only if they really behave in the same way (as far as the common type is concerned), and it comes naturally to think of them as class and subclass (even though this is *not* actually necessary).
Duck Typing is less strict: it does not ask that the program have the same behavior. Two objects are substitutable if they respond to the same things, even if they respond in different ways. It has *nothing* to do with Design By Contract: it is much closer to what C++ does with templates (which some indeed see as the static counterpart of Duck Typing).
Coming back to our topic: if an object behaves according to a certain type (that is, it responds to the methods of that type), then let us treat it as such. Hence: if it behaves like a duck, let us treat it as a duck (Duck Typing, precisely).
A language like Ruby makes it *very* problematic to regard the relationship between types and classes as a tight one. At any point in the program (even if it is not always good practice) we can change the type of all the objects of a given class (by adding methods to it or removing them), or of one single object. Symptomatic in this respect is the deprecation of the method Object#type in favor of Object#class.
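The same points can be sketched in Python (written for Python 3). The class and function names below are made up for the illustration; the sketch only shows that two unrelated classes can share a type, and that adding a method to a class changes the type of its existing instances.

```python
class Duck:
    def quack(self):
        return "Quack!"


class Robot:
    """Unrelated to Duck by inheritance, but answers the same message."""
    def quack(self):
        return "Beep-quack."


class Stone:
    pass


def make_it_quack(thing):
    # No type declaration and no isinstance check: we just send the message.
    return thing.quack()


print(make_it_quack(Duck()))
print(make_it_quack(Robot()))

pebble = Stone()
try:
    make_it_quack(pebble)
except AttributeError as exc:
    print("not a duck:", exc)

# Adding a method to the class changes the "type" of every instance,
# including the ones that already exist.
Stone.quack = lambda self: "..."
print(make_it_quack(pebble))
```

Note that `Duck` and `Robot` answer differently, so they are interchangeable in the duck-typing sense but not necessarily in Liskov's.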
The quick-sort algorithm
Quick-sort is a sorting algorithm developed by Hoare that on average runs in O(n log n). The O() notation means that if the list length is n, the number of operations used to sort the list is of the order of magnitude of n log n (in this case).
To be more formal, we say that there exist two positive constants c1 and c2 such that, for n large enough, the number of operations m satisfies
c1 * n * log n <= m <= c2 * n * log n. Strictly speaking, O only promises the upper bound (the c2 part); the two-sided bound is usually written Θ(n log n). Notice that c1 and c2 are not specified in the notation.
We stop with the math here. If you want more precise information about the O notation, see Wikipedia.
Now I’m going to briefly explain the quick-sort algorithm, and then we will use that algorithm to explore the expressive power of some programming languages. Keep in mind that more expressive does not necessarily mean faster to run. It means faster to write. But being faster to write means that we can try more efficient algorithms, thus improving performance in a more effective way.
Expressive languages are usually high-level. In fact we can also profile the code, track down the bottleneck and rewrite only a couple of functions in C or in ASM. This leads to fast programs with fast development cycles.
Quick-sort strategy
Quick-sort uses a divide and conquer approach. We divide a list into two sublists using a pivot element. All the elements less than the pivot come before it, those greater come after. Then we recursively apply quick-sort to the two sublists.
Notice that the pivot is chosen arbitrarily. Some implementations choose the first element of the list. Some choose it randomly, some use other criteria.
Moreover, some implementations of quick-sort sort the list in place (that is to say, the extra memory used is bounded), while others need extra storage. Guess which are the more efficient.
On Wikipedia you can find more information about the quick-sort algorithm. You can also find some implementations, and a link to a page full of implementations in different languages.
I wrote almost none of the code. I took it from Wikipedia or from around the web and I’m going to comment on it. That is the “new” work I did: comments and explanations, not code. I did also write a couple of the implementations, but I don’t remember which ones.
Python
2.4.2 (#1, Mar 23 2006, 22:00:44)
[GCC 4.0.1 (Apple Computer, Inc. build 5250)]
Now let’s examine a high-level Python implementation.
def qsort(L):
    if L == []: return []
    return qsort([x for x in L[1:] if x< L[0]]) + L[0:1] + \
           qsort([x for x in L[1:] if x>=L[0]])
This is the exact description of the quick-sort algorithm, if you know what Python list comprehensions are. When I write [x for x in L if p(x)], Python computes the list of the elements of the container L that make p(x) true.
L[a:b] is the sublist [L[a], L[a+1], …, L[b-1]] of L (the element at index b is not included). L[a:]==L[a:len(L)], L[:b]==L[0:b]
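A few lines in the interpreter make the slicing rules concrete. This is only an illustration (written for Python 3) with a made-up list; note that a slice stops just before its end index.

```python
L = [10, 20, 30, 40, 50]

print(L[1:3])                 # the elements at indices 1 and 2
print(L[1:] == L[1:len(L)])   # an omitted end means "up to the end"
print(L[:3] == L[0:3])        # an omitted start means "from the beginning"
print(L[0:1])                 # a one-element list: the pivot in qsort
```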
[x for x in L[1:] if x< L[0]] is the sublist of the elements of L, apart from the first one (that is the pivot), such that x is less than the pivot (L[0]).
[x for x in L[1:] if x>=L[0]] is the sublist of the elements of L, apart from the first one (that is the pivot), such that x is greater than or equal to the pivot (L[0]).
Then we pass those sublists to qsort, which recursively sorts them, and finally we concatenate the sorted sublists (with the pivot in between) and return the result.
This implementation, although beautiful to read and understand is horrible from a performance point of view. It creates, concatenates and destroys a large number of lists. Moreover it is purely recursive. For large lists, we can even exhaust the stack space.
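The stack problem is easy to provoke. The sketch below (Python 3) feeds an already sorted list to the first-element-pivot version, which makes the recursion as deep as the list is long; with CPython's default recursion limit this is expected to fail. The `qsort_random` variant, the list size and the variable names are my own additions for the illustration, not part of the code above.

```python
import random


def qsort(L):
    # The first-element-pivot version from the text.
    if L == []:
        return []
    return (qsort([x for x in L[1:] if x < L[0]]) + L[0:1] +
            qsort([x for x in L[1:] if x >= L[0]]))


def qsort_random(L):
    # Same idea, but the pivot is picked at random (hypothetical variant).
    if len(L) <= 1:
        return list(L)
    pivot = random.choice(L)
    less = [x for x in L if x < pivot]
    equal = [x for x in L if x == pivot]
    greater = [x for x in L if x > pivot]
    return qsort_random(less) + equal + qsort_random(greater)


already_sorted = list(range(5000))

try:
    qsort(already_sorted)
    overflowed = False
except RecursionError:
    overflowed = True

print("first-element pivot hit the recursion limit:", overflowed)
print(qsort_random(already_sorted) == already_sorted)
```

A random pivot does not remove the recursion, but it makes the deep, one-sided splits very unlikely.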
A version of quick-sort in Python that has no need for more memory is this
def partition(array, begin, end, cmp):
    while begin < end:
        while begin < end:
            if cmp(array[begin], array[end]):
                (array[begin], array[end]) = (array[end], array[begin])
                break
            end -= 1
        while begin < end:
            if cmp(array[begin], array[end]):
                (array[begin], array[end]) = (array[end], array[begin])
                break
            begin += 1
    return begin

def sort(array, cmp=lambda x, y: x > y, begin=None, end=None):
    if begin is None: begin = 0
    if end is None: end = len(array)
    if begin < end:
        i = partition(array, begin, end-1, cmp)
        sort(array, cmp, begin, i)
        sort(array, cmp, i+1, end)
If you want to run this code, remember that in Python indentation matters. If you just copy and paste the code, you may end up with unusable code.
Anyway, I benchmarked all of them. “Elegant quicksort” is the first implementation. “Standard quicksort” is the latter. “Builtin sort” is the standard Python sort (which is not a quicksort, but a dedicated sorting algorithm).
Elegant qsort: [21.20, 19.75, 20.08, 20.46]
Builtin qsort: [1.61, 2.29, 2.29, 2.29]
Standard qsort: [31.15, 30.60, 30.68, 30.38]
Probably due to the heavy optimization performed on list comprehensions, the “elegant” algorithm is faster than the standard one. However, both of them are roughly ten times slower (or worse) than the Python built-in sort.
We have learnt:
- High level languages are great to express algorithms in a readable way (and we can use them to test them and see how they work and all)
- You should not implement “basic” algorithms in high level languages: use their built-ins or write them in C (most built-ins are written in C).
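The harness used for the numbers above is not shown, so here is a minimal sketch of how such a comparison could be run today with the standard timeit module (Python 3). The list size and the number of repetitions are arbitrary assumptions, and the timings will of course differ from the ones reported above.

```python
import random
import timeit


def qsort(L):
    # The "elegant" implementation from the text.
    if L == []:
        return []
    return (qsort([x for x in L[1:] if x < L[0]]) + L[0:1] +
            qsort([x for x in L[1:] if x >= L[0]]))


# Random data keeps the recursion shallow (see the caveat about stack space).
data = [random.random() for _ in range(20000)]

elegant = timeit.timeit(lambda: qsort(data), number=3)
builtin = timeit.timeit(lambda: sorted(data), number=3)

print("Elegant qsort: %.3f s" % elegant)
print("Builtin sort:  %.3f s" % builtin)
```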
C
Now it is time for a couple of pure C quick-sorts.
void qsort1(int a[], int lo, int hi) {
    int h, l, p, t;

    if (lo < hi) {
        l = lo;
        h = hi;
        p = a[hi];              /* pivot: last element of the range */

        do {
            while ((l < h) && (a[l] <= p))
                l = l + 1;
            while ((h > l) && (a[h] >= p))
                h = h - 1;
            if (l < h) {
                t = a[l];
                a[l] = a[h];
                a[h] = t;
            }
        } while (l < h);

        /* put the pivot in its final position */
        t = a[l];
        a[l] = a[hi];
        a[hi] = t;

        /* recurse on the two halves (qsort1, not the library qsort) */
        qsort1(a, lo, l - 1);
        qsort1(a, l + 1, hi);
    }
}
Another implementation is
/* partition is defined below, so declare it before its first use */
int partition(int y[], int f, int l);

void quicksort(int x[], int first, int last) {
    int pivIndex = 0;
    if (first < last) {
        pivIndex = partition(x, first, last);
        quicksort(x, first, (pivIndex - 1));
        quicksort(x, (pivIndex + 1), last);
    }
}

int partition(int y[], int f, int l) {
    int up, down, temp;
    int piv = y[f];             /* pivot: first element of the range */
    up = f;
    down = l;
    goto partLS;
    do {
        temp = y[up];
        y[up] = y[down];
        y[down] = temp;
    partLS:
        while (y[up] <= piv && up < l) {
            up++;
        }
        while (y[down] > piv && down > f) {
            down--;
        }
    } while (down > up);
    y[f] = y[down];
    y[down] = piv;
    return down;
}
C also has a built-in qsort function, which is much more advanced than these, since it lets you specify a comparison function and therefore works with all kinds of data types.
Benchmarking these was hard. They are too fast to be benchmarked with a list of a million elements and the standard posix clock function. However, since you may want to run the benchmark yourself, I could not use MacOS X specific benchmarking functions.
So I used a list of 100000000 elements.
This is the (ugly) code I used for benchmarking. The test is not really interesting. Of course since C integers are machine integers it is dramatically faster than other solutions. It could have been more interesting if we sorted other data structures, where the gap could be reduced.
#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include "qsort1.h"
#include "qsort2.h"

#define LEN 100000000

/* comparison function for the library qsort; rand() never returns
   negative values, so the subtraction cannot overflow here */
int intcmp(const void *a, const void *b){
    return *(int*)a - *(int*)b;
}

int main(){
    size_t index;
    int *l = malloc(sizeof(int)*LEN);
    clock_t last, current;

    if (l == NULL) {
        fprintf(stderr, "not enough memory\n");
        return 1;
    }

    srand(time(NULL));

    for(index = 0; index < LEN; ++index){
        l[index] = rand();
    }
    last = clock();
    qsort(l, LEN, sizeof(int), intcmp);
    current = clock();
    printf("C qsort: %f\n", (current-last)/(float)CLOCKS_PER_SEC);

    for(index = 0; index < LEN; ++index){
        l[index] = rand();
    }
    last = clock();
    qsort1(l, 0, LEN-1);
    current = clock();
    printf("int qsort: %f\n", (current-last)/(float)CLOCKS_PER_SEC);

    for(index = 0; index < LEN; ++index){
        l[index] = rand();
    }
    last = clock();
    quicksort(l, 0, LEN-1);
    current = clock();
    printf("qsort 2: %f\n", (current-last)/(float)CLOCKS_PER_SEC);

    free(l);
    return 0;
}
Until a few days ago, I considered myself a “faithful” Pythonist. I liked ObjectiveC. I liked a bunch of other languages (notably Prolog or Haskell). I quite liked C++ too. But in fact my true love was Python.
I know it is strange to talk about “love”. That is of course improper. Unfortunately I have no time to find a better word for it. Let’s say that I liked coding in Python regardless of what I was coding. In much the same way, I love using MacOS regardless of the task, while I may be forced to use Windows because of the task but would not use it if it were up to me (phew… complicated sentence).
Today I read this (pointless) discussion about “self” in Python. There are people who
- do not like writing self.foo
- do not like writing
def foo(self, other): # code
I perfectly understand this position. In fact I do not really like having to reference self explicitly every time (even if I fully understand why Python does this). Of course it makes sense.
I’m calling methods / accessing variables of an object, so I should use the conventional object.method or object.variable syntax. There should be only one way to do it. For example, Java’s “optional” this sucks, at least in my opinion. It serves no purpose when calling methods or accessing variables. Some use it to make clear that they’re referencing instance variables, some use an m_var notation.
If you are a C++ programmer you could be using var_ or (if you haven’t read the standard quite recently) _var.
Of course having a clear and readable way to distinguish instance variables and methods is good. That much is clear. It makes the code easier to read.
self is tedious. I often forget it and get error messages (around the ninth consecutive hour of programming this is not the worst thing I do, however). In the same way I also forget the Ruby @. And a spectacular error is better than a hidden bug. So… let’s go on.
I don’t much like having to specify self among the formal parameters. You schiznick, don’t you know it’s a method? Actually, Python does not. If you take a normal function and bind it to an object, the first formal parameter is the object itself. So it’s better that the function was meant that way from the beginning and has a leading self parameter.
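The point about binding can be seen directly. In this small Python 3 sketch (the names `Pet`, `describe` and `shout` are invented for the example), an ordinary function becomes a method simply by being attached to a class or bound to an object, and in both cases the object arrives as the first parameter.

```python
import types


def describe(self):
    # An ordinary module-level function; nothing marks it as a method.
    return "I am " + self.name


class Pet:
    def __init__(self, name):
        self.name = name


# Attached to the class, it becomes a method of every instance.
Pet.describe = describe
rex = Pet("Rex")
print(rex.describe())


# It can also be bound to a single object.
def shout(self):
    return self.name.upper() + "!"


rex.shout = types.MethodType(shout, rex)
print(rex.shout())

# A bound method is just the function plus the object as first argument.
print(Pet.describe(rex) == rex.describe())
```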
Of course this is tedious, but it’s necessary and has no real disadvantages (apart from some additional typing, which with a decent editor is not a concern; Aquamacs/Emacs or TextMate strongly advised).
And so all the knots come to the comb. Python has this tedious self everywhere. And it is there and should be there. Ruby doesn’t.
The @ makes the code quite readable (especially with a decent editor). Of course it prevents name clashes and such. Not having to pass self to functions also makes it easier to refactor functions into methods (ok, we know, in Ruby every function is a method, but that’s not the point), at least when the method uses no instance variables.
But…
But Ruby treats variables and methods differently. Instance variables need a “special” syntax. Methods don’t. It’s not clear if a method is a function or not (of course it does the Right Thing)
irb(main):001:0> def foo; "foo"; end
=> nil
irb(main):002:0> class Bar; def foo; "dont foo" ; end
irb(main):003:1> def bar; foo; end
irb(main):004:1> end
=> nil
irb(main):005:0> b = Bar.new
=> #<Bar:0x...>
irb(main):006:0> puts b.bar
dont foo
=> nil
As I said, it does the Right Thing… I’m just talking about readability. Of course you can use self in Ruby too… but well. This is something I would have preferred solved in another way, even if right now I find it quite acceptable to give up having “one way” in exchange for a bit more pragmatism.
In fact this is not a Ruby vs. Python list. I do not know enough Ruby and I love Python too much :). It’s more a picture of the first impressions I had of Ruby after I seriously began studying it.
This is a quick list of thoughts. I’ll probably add more stuff later. In fact I’m really amazed. Ruby is really hackish, but it’s also neat and clean. It may sound strange… but I’m afraid I’m gonna love Ruby more than Python: it addresses many of the things I have come to dislike in Python.
Some stuff I like
- Adding dynamically stuff to classes. The syntax is clean and obvious
    class MyClass
      def aMethod
        puts "Called aMethod"
      end
    end

    m = MyClass.new
    begin
      m.aMethod2
    rescue NoMethodError => ex
      puts "Failed to call #{ex}"
    end

    class MyClass
      def aMethod2
        puts "Called aMethod2"
      end
    end

    m.aMethod2
  In Python, for example, it is not that clean. The Ruby code looks more like ObjectiveC categories (even if in fact you don’t specify a category, of course, since Ruby does not need categories).
- private, protected, public modifiers. Being dynamic they do not limit me in any way (if I need I can dynamically change this)
    class MyClass
      protected
      def aMethod
        puts "Called aMethod"
      end
    end

    m = MyClass.new
    begin
      m.aMethod
    rescue NoMethodError => ex
      puts "Failed to call #{ex}"
    end

    class MyClass
      public :aMethod
    end

    m.aMethod
  Moreover, I can’t think of a cleaner syntax to do this. Of course one can argue that they aren’t really necessary. True, but if you want them they give you a little bit of control without limiting you in any way.
- Object#freeze: if you suspect that some unknown portion of code is setting a variable to a bogus value, try freezing the variable. The culprit will then be caught during the attempt to modify the variable.
- I love postfixed control clauses. Used in moderation they can improve code readability (and of course if abused they make it less readable). However, it is pretty logical to say do_something if something_else. In fact, since Ruby has blocks, it is also ok to write
    begin
      # some stuff
    end if condition
  maybe not everyone agrees
- Accessors are great. Using attr and similar constructs is great. I also find quite more readable to “assign” with foo= than using more Javesque setFoo(val). In fact properties in Python work in a similar fashion, even if maybe a bit more verbose (however, I think they make it clearer how to document code, but this is a ruby aspect I’ve not yet exploited)
- Gems: I think python needs something like that. Pimp is not that good, in my opinion, of course.
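On the accessors item: for comparison, this is roughly what the Python side looks like. It is a minimal Python 3 sketch with an invented `Temperature` class; the docstring on the getter is what I mean by properties making it clearer where the documentation goes.

```python
class Temperature:
    def __init__(self, celsius):
        self._celsius = celsius

    @property
    def celsius(self):
        """Temperature in degrees Celsius."""
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Runs on every assignment to t.celsius
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


t = Temperature(20)
t.celsius = 25          # looks like plain assignment, runs the setter
print(t.celsius)
print(Temperature.celsius.__doc__)
```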
Stuff I don’t know if I like or not
- Not having to write parentheses with no-argument methods. I’ve not yet discovered if I like it or not.
- All the $_… They are used a lot in Perl. I like it, but I’m afraid it can mess things up.
Stuff I dislike
- I don’t like passing strings to require. It can have advantages, but I prefer the pythonic import foo to require "foo".
- The semantics of try/catch make me think of some kind of disguised goto. I’m sure I’m wrong… but
- I’d like to have something like Perl use strict;. It’s something I miss in Python (even if with pychecker you can just live without it). Now I’ve got to find some kind of “use strict” or “rubycheck”.
- Assignment is an expression. This directly leads to code like
    a = true
    b = false
    print b if b=a
  and we imagine the programmer wanted to write
    a = true
    b = false
    print b if b==a
  However, in Ruby almost everything is an expression, and that has quite a lot of advantages. So we have to tolerate the “=” for “==” problem.
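For contrast, Python treats plain assignment as a statement, so the same slip is rejected before the program even runs. The sketch below (Python 3) just asks the compiler; the variable names are arbitrary.

```python
# The Ruby mistake, transliterated: "=" where "==" was meant.
source = "if b = a:\n    print(b)\n"

try:
    compile(source, "<example>", "exec")
    accepted = True
except SyntaxError:
    accepted = False

print("accepted by the Python compiler:", accepted)
```

Recent Python versions do offer an assignment expression, but it uses a different operator (`:=`), so it cannot be typed by accident in place of `==`.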
The small script below uses PyObjC (it was born as a snippet in ipython to have a quick check to a pair of variables). Of course you can write the very same thing in ObjectiveC.
I strongly encourage you to install PyObjC and ipython even if you work with Cocoa and ObjectiveC, since you can use them to prototype your application and to test snippets of code.
#!/usr/bin/python
# -*- coding: utf-8 -*-

import sys
import Foundation

ud = Foundation.NSUserDefaults.standardUserDefaults()
d = ud.dictionaryRepresentation()

for k in d:
    sys.stdout.write(k)
    sys.stdout.write(":\t")
    try:
        print d[k]
    except UnicodeEncodeError, e:
        print d[k].encode('utf-8')
To have more information on PyObjC (that is to say Cocoa bindings for Python), go here.
You can find ipython here. ipython is an improved interactive shell, with powerful introspection capabilities.
Since the Python that ships with MacOS X comes with no support for readline, I advise installing the missing module for Python 2.3 (search google, I don’t remember right now where to find it) or, better, installing this version of Python 2.4, complete with the patch (on the same page).
Of course you must set your path so that the “first” python is the new one (if you set PYTHON_ROOT and such, you must also fix them).
Remember that when you “python setup.py install” a module (that is the typical command line to install a python package), it is installed for the python version called by default (find it out with which python).
Although you can easily do with module subprocess what I do with the following class:
- If you have Python 2.3 you may not have subprocess
- This class is much simpler than subprocess: it does just one thing.
- You may want to pipe two commands in a fancier way
I wrote this because I needed to use Python to do some system automation and I needed something that acted like Perl and bash backticks (in bash 2 you are supposed to write $(command) instead of `command`, but it’s syntactic sugar).
In my effort to stay compatible with the standard MacOS system (which does use 2.3 by default) I forgot that module subprocess existed. So I wrote this simple class.
Even if in production you may prefer to use subprocess, you can consider this an example of streams in Python. This is a very simple example. If you want to really pipe two processes, use module subprocess. If you want to make streams in a C++ fashion you can take inspiration from this code, but you may want to make it line based (but this ain’t no problem: in fact we are using properties, so you can do whatever you want with them; an example later).
Stream acts like a Mixin (thanks Dialtone!). It defines __rshift__ and __lshift__ (that become >> and <<).
class Stream(object):
    def __rshift__(self, rho):
        if isinstance(rho, Stream):
            rho.input = self.out
            return rho
        else:
            raise TypeError()

    def __lshift__(self, rho):
        if isinstance(rho, Stream):
            self.input = rho.out
            return self
        else: raise TypeError()
Remember: classes that subclass Stream must have an attribute named out and one named input. If you need something more complex, use properties to implicitly call functions every time you access out or input. For example, since I use lazy evaluation I do:
input = property(fset=set_input, fget=get_input)
out = property(fget=get_out)
err = property(fget=get_err)
Now have a look at the whole code:
import os

class Stream(object):
    def __rshift__(self, rho):
        if isinstance(rho, Stream):
            rho.input = self.out
            return rho
        else:
            raise TypeError()

    def __lshift__(self, rho):
        if isinstance(rho, Stream):
            self.input = rho.out
            return self
        else: raise TypeError()

class Execute(Stream):
    def __init__(self, command, *kargs, **prefs):
        self._commandString = ' '.join((command,) + kargs)
        self.has_run = False

    def _lazy_run(self):
        if self.has_run: return
        self.has_run = True
        (self.sin, self.sout, self.serr) = os.popen3(self._commandString)
        self.sin.write(self.input)
        self.sin.close()
        self.out_list = list([line.strip() for line in self.sout])
        self.err_list = list([line.strip() for line in self.serr])

    def run(self):
        self._lazy_run()
        return self

    def _make_string(self, which):
        self._lazy_run()
        if hasattr(self, "%s_string" % which):
            return
        setattr(self, "%s_string" % which,
                '\n'.join(getattr(self, "%s_list" % which)))

    def set_input(self, s):
        if hasattr(self, "input"):
            self.input_ = "%s%s" % (self.input, str(s))
        else:
            self.input_ = str(s)

    def get_input(self):
        try:
            return self.input_
        except AttributeError:
            return ""

    def get_out(self):
        self._make_string("out")
        return self.out_string

    def get_err(self):
        self._make_string("err")
        return self.err_string

    input = property(fset=set_input, fget=get_input)
    out = property(fget=get_out)
    err = property(fget=get_err)

    def __str__(self):
        return self.out

print Execute("cat Execute.py") >> Execute("wc -l")
print Execute("wc -l") << Execute("cat Execute.py")
If you need to do different things, redefine (get|set)_(out|input|err).
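The class above relies on os.popen3, which only exists in Python 2. For readers on Python 3, the same `>>` idea can be sketched on top of subprocess. This is an assumption-laden rewrite, not a port: it takes the command as separate arguments instead of one shell string, it drops the err stream, and it uses the Python interpreter itself as the two example commands so that it does not depend on cat or wc being installed.

```python
import subprocess
import sys


class Stream:
    def __rshift__(self, rho):
        if not isinstance(rho, Stream):
            return NotImplemented   # Python turns this into a TypeError
        rho.input = self.out
        return rho

    def __lshift__(self, rho):
        if not isinstance(rho, Stream):
            return NotImplemented
        self.input = rho.out
        return self


class Execute(Stream):
    def __init__(self, *argv):
        self.argv = list(argv)
        self.input = ""
        self._result = None

    @property
    def out(self):
        # Lazy: the command runs the first time its output is needed.
        if self._result is None:
            self._result = subprocess.run(
                self.argv, input=self.input,
                capture_output=True, text=True)
        return self._result.stdout

    def __str__(self):
        return self.out


produce = Execute(sys.executable, "-c", "print('a'); print('b'); print('c')")
count = Execute(sys.executable, "-c",
                "import sys; print(len(sys.stdin.readlines()))")
print(produce >> count)
```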
If you don’t know what GeekTool is, go here.
It is very likely you’ll find it useful. If you use Tiger, use this version that fixes a lot of issues.
GeekTool allows you to put a lot of interesting information on the desktop. You can “print” logfiles on the desktop, put pictures there or, and this is the interesting part, show the output of a chosen command.
For example I put a random fortune on the desktop. It is easier to download GeekTool and try it than to read an explanation.
An interesting feature that is in the documentation (so it’s something you probably wouldn’t read) is that scripts/commands placed in “~/Library/Application Support/GeekTool Scripts” do not need to be specified with their full path. So we will put the script “cpuload.py” in that directory and we will refer to it as “cpuload.py”.
And now the script:
#!/usr/bin/python
# -*- coding: utf-8 -*-

import os

class command(object):
    def __init__(self, c):
        self._command = c
        fi, foe = os.popen4(self._command)
        fi.close()
        self.out = list([line.strip() for line in foe])
        foe.close()

def cpu_load(string_list):
    # a bit functional… not pythonic at all
    return sum(map(lambda l: float(l.split()[2]),
                   string_list), 0)

def main():
    ulist = command("ps -ux").out[1:]
    slist = command("ps -aux").out[1:]
    print "System CPU: ", cpu_load(slist), "%"
    print "User CPU: ", cpu_load(ulist), "%"

if __name__ == "__main__":
    main()
Class “command” is a stripped down version of a class I’m writing for another project. I also think that any script you write should be executable.
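The parsing step can be tried without running ps at all. The following Python 3 sketch feeds the same "third column is %CPU" logic a couple of made-up lines; the sample processes and user names are hypothetical, and the explicit loop replaces the map/lambda only for readability.

```python
def cpu_load(lines):
    """Sum the %CPU column (third field) of `ps -aux`-style lines."""
    total = 0.0
    for line in lines:
        fields = line.split()
        if len(fields) > 2:         # skip blank or truncated lines
            total += float(fields[2])
    return total


# Hypothetical sample output, header line already removed.
sample = [
    "root     1  0.1  0.2  12345  678 ??  Ss  9:00AM  0:01.23 /sbin/launchd",
    "alice  321  2.5  1.0  23456  789 ??  S   9:01AM  0:10.00 /usr/bin/python",
]

print("CPU: %.1f%%" % cpu_load(sample))
```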

