Thursday, March 24, 2011

On multicore Stata MP performance

Since I am in the middle of some very interesting discussions and experiments on this topic: the difference in speed between the various multicore/multiprocessor Stata/MP versions really does look considerable. Right now I am running a comparison exercise* with my friend and co-author Miguel, in which certain estimations with an 8-core Stata/MP 11 take about 10 to almost 15 times less real time (I know, I hardly believe it myself) than (almost) the same estimations done with a 2-core Stata/MP 11. Taking the median across all estimation commands in Stata, an 8-core will outperform a 2-core by a factor of 2.28 (NB: this ratio appears to be larger than the price ratio of these two multicore versions!). I computed this figure from other statistics available in this 250-page report on Stata/MP's performance.

The next quest should be to assess Stata's bold claim: "From dual-core laptops to the big iron of multiprocessor servers, Stata gets the most out of multicore systems. No other statistical software comes close" (my emphasis in bold). If true, that surely ought to boost Stata's status in the statistics/econometrics community (part of which-- I plead guilty too-- is currently infatuated with other programming languages such as Ox, Fortran, Gauss, Matlab, etc., while typically leaving Stata for simple exercises or data manipulation).

*A brief update footnote here, for clarification (well, at least partial clarification; the huge execution time differential, a factor of 10 to 15, remains somewhat puzzling): we cannot really perform this comparative exercise directly. "(Almost) the same" here means that we do have identical specifications (and the same software and operating system), but nonetheless different (very large) panel datasets. In this particular case the difference in the structure of those datasets matters (specifically, the "connectedness" of the cross-sectional time-series data, though the intention was not to give too many details), beyond any possible difference in the number of observations (for our purpose virtually the same). If we ran identical specifications on exactly the same dataset with the 8-core and the dual-core Stata/MP respectively, the speed ratio could not be higher than 8/2 = 4, the theoretical limit. Given the type of exercise performed here, the real-time differential due solely to the difference in the number of cores is most likely close to this limit.
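
To make the 8/2 = 4 ceiling and the 2.28 median a bit more concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes Amdahl's law as the scaling model, which is my assumption and not something taken from the Stata report, and all function names are made up for illustration. Under that model a command whose work is fully parallelizable reaches the 8/2 = 4 limit, and one can back out what parallelizable share would be consistent with a median 8-core vs. 2-core ratio of 2.28.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over a single core under Amdahl's law (assumed model)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_speedup(parallel_fraction, fast_cores=8, slow_cores=2):
    """How many times faster the fast_cores run is than the slow_cores run."""
    return (amdahl_speedup(parallel_fraction, fast_cores)
            / amdahl_speedup(parallel_fraction, slow_cores))


def implied_parallel_fraction(ratio, fast_cores=8, slow_cores=2):
    """Invert relative_speedup: the parallel share p that yields `ratio`.

    From ratio = (1 - p*a) / (1 - p*b), with a = 1 - 1/slow_cores and
    b = 1 - 1/fast_cores, it follows that p = (ratio - 1) / (ratio*b - a).
    """
    a = 1.0 - 1.0 / slow_cores
    b = 1.0 - 1.0 / fast_cores
    return (ratio - 1.0) / (ratio * b - a)


if __name__ == "__main__":
    # Fully parallel work hits the theoretical limit of 8/2 = 4.
    print("limit:", relative_speedup(1.0))
    # Parallel share consistent with the 2.28 median ratio quoted in the post.
    print("implied parallel share:", round(implied_parallel_fraction(2.28), 3))
```

Under this (assumed) model, the 2.28 median corresponds to roughly 86% of the work being parallelizable, while anything above a ratio of 4 cannot come from the number of cores alone, which is consistent with attributing the factor of 10 to 15 to the differences between the datasets.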

