
May 2007 Archives

May 7, 2007

The MT v3.35 "upgrade"

Massive amounts of work need to be done to pull out unwelcome comments and trackbacks... hence, a new start for a new stage of life. Rather than taking the usual upgrade path, I went ahead and installed a clean v3.35 Movable Type. Over the next few weeks I will be incorporating old posts from the original Tim's Journal, as well as ranting anew. Welcome!

StyleCatcher installation notes, repositories

All in all, the v3.35 install went quite smoothly, though I did have some issues getting StyleCatcher, the included style management plugin, to work properly. Looks like I'm not the only one... Symptom: I could select a thumbnail from the repository and received confirmation of a successful install, but most (though not all) selected styles resulted in a blog with no style.

The fix for me turned out to be very straightforward: the Theme Root URL was incorrect. As with most MT installs, after unarchiving the tar.gz to the cgi-bin directory, we are instructed to relocate the mt-static directory to an accessible location. StyleCatcher, unfortunately, is not savvy enough to check mt-config.cgi for the mt-static location. Instead it defaults (in my case) to .../torque/cgi-bin/mt/mt-static/themes rather than .../torque/mt-static/themes. Once corrected, the plugin works as promised. The current template is Fleur by Jennifer Maloney.

Once you get it going, you might be interested in locating other StyleCatcher repositories. Here's my list (with appropriate Theme or Repository URL):

Note that some of these take longer to load than others. Have patience and enjoy! If you know of any more, let me know.

*Very much related to "thestylecontest".

May 8, 2007

TypeKey apologies

MovableType 3.x allows you to restrict commenting to those authorized by SixApart's TypeKey. After turning it on, I noticed that not only was I unable to comment (on my own entries), but I was also getting the following unhelpful error:

Our apologies.

TypeKey has encountered an error. The requested page could not be found or the requested action could not be completed. Please check the URL carefully and try again. If you accessed this page from an email, make sure you cut and pasted the URL accurately.

This was resolved with some sleuthing -- special thanks to Paul Philippov. See the answer in the MovableType Knowledge Base. Log into TypeKey and scroll down to the bottom of Account Preferences. Under Weblog Preferences you should find a link to the blog URL in question. That URL must end with a trailing slash, i.e., "http://torque.gig8.com/". Lame.

Zojirushi NS-JCC18 rubber seal

Some time ago, the rubber seal on our rice cooker (Zojirushi NS-JCC18; attached to the bottom of the mushroom-looking object with the ball inside) fell apart while I was cleaning it (fatigue, probably due to repeated steaming while cooking rice). While I simply needed the rubber seal, it appears that the entire "steam vent case" requires replacement. The price is $4.20, plus $2.20 for s/h. To order the part by credit card, contact Connie Alvarez either by email or by phone at 1-800-733-6270, ext. 105 (or contact Jesse at ext. 108 if she is not available). Or, send a check or money order for payment, made out to Zojirushi America Corp., to:

Zojirushi
6259 Bandini Blvd.
Commerce, CA 90040
Attn: Customer Service

ns-jcc18.jpg

May 9, 2007

Omron 712C --> HEM790-IT with USB

Last Christmas, my parents sent me an Omron blood pressure monitor (uh...thanks). At first, I was a bit disgruntled, but this has turned out to be a wonderful gift! Since receiving it, I have been rather obsessively jotting down the numbers at various times during the day. I keep meaning to get them into Excel, but it really has been too much of a pain. What I really want is some sort of auto-download into the PC. Unfortunately, my model, the Omron 712C, while recording the last N results, does not have any straightforward way of getting data into a PC. The closest I came was finding four pads right under the pressure sensor suggestively labeled GND, RXD, TXD, VCC. These are probably used to program the device.

IMG_6105.JPG

Enter the Omron HEM790-IT. It costs about twice as much, but is still under $100, and it comes with "Health Management Software" which promises to check for "morning hypertension" and supports 2 users. The most important piece, however, is the USB slot. The software manual suggests an MDAC and .NET requirement, which means the data should be accessible. Now if I can only get it to transmit wirelessly!

WTB 32GB NAND SSD drive

About a year ago, Samsung announced the Samsung Q1 (Q1P SSD 7" Ultra Mobile PC - Intel Pentium M ULV Processor, 1 GB RAM, 32 GB Flash Drive, Windows XP Tablet), an ultra-mobile computing device hosting a 32-Gigabyte (GB) NAND flash-based solid state disk (SSD). Promised: 300 percent faster reads (53 MB/s), 150 percent faster writes (28 MB/s), complete silence (except for the CPU fan...). I just want the SSD. How would it compare to a USB Flash drive of equivalent size? How would it compare to an SDRAM-based SSD?

Wikipedia discusses SSD availability:

  • Super Talent Technology announced they will ship a 3.5-inch 160 GB Solid State Drive in April 2007.
  • Sandisk released a 32GB 2.5-inch solid state drive on March 13, 2007. The SSD SATA 5000 is being sold to computer manufacturers for $350.
  • Sandisk released a 32GB 1.8-inch solid state drive on January 4, 2007 (where is this?)
  • Taiwanese A-DATA introduced at the Las Vegas CES 2007 SSD drives at capacities of 32GB, 64GB (1.8" model) and 128GB (2.5" model). It is expected to be commercially available by mid-2007.
  • SimpleTech has announced a 64GB SSD that is only 9.5mm thick, half the size of competing SSDs. On April 18, 2007 SimpleTech announced 256GB capacity enterprise level drives available immediately and 512GB capacity drives available late 2007.
  • Adtron announced a 160 GB SATA SSD on February 20, 2007.
  • Samsung has upped the capacity of its Flash-based SSD line to 64GB on March 27, 2007
  • Dell has begun shipping ultra-portable laptops with solid state drives (SSD) on April 26, 2007

Storage benchmarks

Jason Phillips has a three-part series on hard drive benchmarking on the BleedinEdge. It is very thorough. Why does this matter? A poorly maintained hard drive decreases over-all system performance because apps use the drive-based page file when system memory is insufficient. This post summarizes the software reviewed.

  • Disk Bench - "How fast are my disks really. In a real life situation." Copies file from A to B, times it, then deletes file. Small installation file, very simple, few configuration options, .NET affects performance. (free, v2.5.0.3)
  • DiskSpeed32 - Tests drive cylinders, graph difficult to understand, long test (free)
  • Drive Speed Checker - Free trial nags, advertising, one button to start test, tests read/write speed, directory lookup speed (free trial, $4.99, v1.5.5)
  • HD Speed - Destructive write test (achtung!), measures sustained and burst data transfer rates in realtime (free, v1.5.2.61)
  • HDD Speed Test - Basic tests, option to disable system cache and pagefile (free, v1.0.11)
  • HD Tach - low level hardware benchmark, measures sequential read speed, random access speed, interface burst speed and CPU utilization of drive; registered version adds sequential write testing (free trial, v3.0)
  • Iometer - Originally developed by Intel for single and clustered systems, not user friendly, advanced, use to compare with advertised specs, used by manufacturers (free)
  • IOzone - not tested
  • MHDD - not tested (v4.6)
  • PassMark Performance Test - sequential rw, random seek+rw, easy to install and use, includes many other tests (free trial 30 days, $24, v6.1)

Flash RAID array

A trip to Fry's this afternoon and a chance encounter with a Lexar Lightning USB drive reminded me of an idea I had to RAID together USB flash drives. HardForum has a nice discussion and a reminder of throughput: USB 2.0 is 480 Mbps (60 MB/s) and the fastest Flash is about 30-40 MB/s. Meanwhile, IDE is 133 MB/s. USB flash speeds differ substantially between products; see the Moka5 USB Flash Drive Showdown for ratings. Pricing also varies substantially by site. The 2GB Lexar Lightning was $54.99 at Fry's.

May 11, 2007

eDiscovery and the data transfer problem

Discovery is a legal device employed by a party in a civil or criminal action, prior to trial, to require the adverse party to disclose information essential to case preparation which only the other party knows or possesses [1]. Electronic discovery or eDiscovery is simply discovery applied to electronic media, e.g., email, documents, spreadsheets, schematics, instant messenger logs, voice mail recordings. eDiscovery is growing. It is growing because the legal requirements are expanding [2]. It is growing because technology is making more and more data readily accessible and therefore discoverable. Forrester expects eDiscovery technology to grow from $1.4B in 2006 to $4.8B in 2011; the 2006 Socha-Gelbmann Electronic Discovery Survey estimates $1.8B in 2006 to $3.1B in 2008 [3]. Machine costs should range between 15-25%.

I have spent the last two years working for Gallivan, Gallivan & O’Melia (GG&O), a Seattle-based firm that offers both software and consulting to law firms and enterprises for electronic discovery. GG&O’s capabilities (and hence my experience) range from forensic data acquisition to native document processing and review support through imaging for production. I functioned as both the technical and managerial lead for the Silicon Valley office.

During my tenure, a “standard” matter ballooned from several hundred gigabytes and several hundred thousand files to multiple terabytes and multiple-million files. Because of the volume of data involved, data transfer, from drive to drive and from drive to memory to CPU (for hashing, indexing), has become the primary bottleneck holding attorneys back from review. Generally, unlike gaming or many scientific pursuits, eDiscovery is not computationally intensive; performance is not CPU-bound; it is input/output (i/o) bound.

Other applications that are i/o bound include bioinformatics, homeland security, financial (transactional) databases, and enterprise document management systems. For these, having increased data throughput can generally be categorized as “nice-to-have” or “do-it-only-when-it-becomes-cheap-enough”. The requirements of electronic discovery, in contrast, are business critical. The legal and financial pressures are tangible and quantifiable (especially when dealing with government agencies with three-letter acronyms!). As the volume of data increases, the machine time, mostly because of data transfer issues, can stretch to days. Even when RAM or Flash-based solid state drives (SSD) become available, the time required to transfer data will remain a limiting factor. The interesting thing is that the eDiscovery industry will technologically drive itself. As faster data transfer becomes available, because litigation is competitive and time is always of the essence, there will be uptake.

The data transfer bottleneck. The bottleneck for data transfer comes from the need to access the entire volume through a single interface. IDE tops out at 133 MB/s, and SATA interfaces range from 150-300 MB/s. (USB devices operate at approximately 60 MB/s, DRAM up to 2-3 GB/s.) Besides the interface, there is also the issue of sequential versus random i/o. Operating at the highest rates, a terabyte (TB) takes approximately 1-2 hours to sequentially transfer from one drive to another. When files are accessed randomly (and there are many small files), the transfer time could be extended as much as 3-4x. The random versus sequential discrepancy can be alleviated by using RAM or Flash SSDs, which have better ways of addressing the data. But this still will be throttled by the interface. Where is my 1 TB RAM computer?
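As a rough check on the figures above, here is a back-of-the-envelope sketch of the transfer-time arithmetic. It assumes a decimal terabyte (10^12 bytes), assumes the interface runs at its full nominal rate for the whole copy, and models random access as a simple multiplier; the function name and the multiplier parameter are mine, for illustration only.

```python
def transfer_hours(volume_bytes, rate_mb_per_s, random_penalty=1.0):
    """Hours needed to move volume_bytes at rate_mb_per_s (decimal MB/s).

    random_penalty is a crude multiplier for random i/o over many small
    files (the post suggests as much as 3-4x); 1.0 means purely sequential.
    """
    seconds = volume_bytes / (rate_mb_per_s * 1e6)
    return seconds * random_penalty / 3600.0


TB = 1e12  # assumption: decimal terabyte

# Nominal interface rates quoted in the post, in MB/s.
for name, rate in [("USB 2.0", 60), ("SATA 150", 150), ("SATA 300", 300)]:
    sequential = transfer_hours(TB, rate)
    worst_case = transfer_hours(TB, rate, random_penalty=4.0)
    print(f"{name:9s} sequential {sequential:5.2f} h, random (4x) {worst_case:5.2f} h")
```

At the two SATA rates this comes out to a little under one hour and a little under two hours per terabyte, which is where the 1-2 hour estimate comes from. Real drives rarely sustain the interface rate, so treat these as lower bounds.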

References
1. West Publishing Company and West Group. West's Encyclopedia of American Law. 1998, Minneapolis/St. Paul, MN: West Pub. Co. v. 1-12.
2. Federal Rules of Civil Procedure. 2006 [cited 2007 May 10]; Available from: http://www.law.cornell.edu/rules/frcp/.
3. Skjekkeland, A. eDiscovery Market Size. AIIM Knowledge Center Blog 2007 [cited 2007 May 10]; Available from: http://infogovernance.blogspot.com/2007/03/ediscovery-market-size-aiim-knowledge.html.


Tape restoration (forensic alchemy) and the new rules

Someone wise once jokingly advised me, “if you want to securely destroy data, back it up onto tape”. Having restored (or attempted to restore) a fair sampling of (potentially corrupt) tapes, I say this is not far from the truth. From a technical standpoint, this is especially the case when absolutely no information can be gleaned from the client on what (deprecated) OS, what (legacy) software (or worse, combination of software) and which (ancient) tape drive was used. Indeed, tape restoration can be very much forensic alchemy! That is why it costs so much and can take so long. But, from a legal standpoint, does technically difficult-to-restore imply “off-the-hook”? Well, as your attorney will respond, it depends.

The freshly amended Federal Rules of Civil Procedure treat this issue in Rule 26(b)(2) [1]. It reads…

(2) Limitations.

(A) By order, the court may alter the limits in these rules on the number of depositions and interrogatories or the length of depositions under Rule 30. By order or local rule, the court may also limit the number of requests under Rule 36.

(B) A party need not provide discovery of electronically stored information from sources that the party identifies as not reasonably accessible because of undue burden or cost. On motion to compel discovery or for a protective order, the party from whom discovery is sought must show that the information is not reasonably accessible because of undue burden or cost. If that showing is made, the court may nonetheless order discovery from such sources if the requesting party shows good cause, considering the limitations of Rule 26(b)(2)(C). The court may specify conditions for the discovery.

(C) The frequency or extent of use of the discovery methods otherwise permitted under these rules and by any local rule shall be limited by the court if it determines that: (i) the discovery sought is unreasonably cumulative or duplicative, or is obtainable from some other source that is more convenient, less burdensome, or less expensive; (ii) the party seeking discovery has had ample opportunity by discovery in the action to obtain the information sought; or (iii) the burden or expense of the proposed discovery outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties' resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues. The court may act upon its own initiative after reasonable notice or pursuant to a motion under Rule 26(c).

Ronni Abramson has a great article in Legal Tech discussing two recent rulings: Best Buy Stores L.P. v. Developers Diversified Realty Corp. and Ameriwood Industries Inc. v. Liberman [2]. Here are the bullets:

Best Buy Stores L.P. v. Developers Diversified Realty (DDR) Corp.

  • Best Buy alleges overcharges for insurance and maintenance, seeks documentation on how insurance charges were calculated
  • DDR fails to respond and thus waives objections
  • Best Buy files motion to compel
  • DDR argues in brief that processing would exceed $125,000 and to hold determination until all issues have been sorted out, offers no proof to support argument that tapes were not reasonably accessible
  • Magistrate Judge Jeanne Graham states "the Defendants offer no proof, aside from conclusory statements, about the cost to obtain documents from electronic archives. So this concern cannot shield the defendants from discovery here." Orders responsive docs in 28 days.
  • DDR files objection with U.S. District Court Judge David Doty requesting rolling productions
    • submits unsworn statement from Kroll Ontrack advising 102-122 days to restore all tapes,
    • submits affidavit by director of technology one day late (because of illness) stating the number of tapes (345), that the tapes were used solely for disaster recovery, and that an outside vendor would be required,
    • submits cost estimates from Kroll for restoration, filtering and processing (before review) – between $288,300 and $468,100 (~$1,000 per tape).
  • Doty unconvinced, upholds production order and timeline
  • DDR files motion for reconsideration.
  • Graham denies arguments that DDR was not aware of costs or delays and could not have presented evidence to support objections earlier.

Ameriwood Industries Inc. v. Liberman

  • Ameriwood alleges that Liberman, while employed, used confidential information to sabotage business relationships.
  • Liberman claims that lost sales were due to Ameriwood mismanagement, requests production of documents to show mismanagement.
  • Ameriwood produces some documents, objects that requests are “overly broad and unduly burdensome”
  • Liberman moves to compel, requests all responsive documents within a date range
  • Ameriwood argues that request would result in reviewing hundreds of thousands of documents, submits affidavit from forensics firm detailing that
    • the firm had collected responsive emails sent within the date range for 23 former and current employees into a database
    • calculated that the emails within the database numbered in the hundreds of thousands
    • calculated that the emails for the six employees identified by Liberman would result in 60,000 emails and attachments
  • Judge rules that requested information is not reasonably accessible (because of review volume, not necessarily technical concerns), also that Liberman did not have a sufficiently narrow request.

The eDiscovery theme here is crystal clear: know what you have, know what it’ll cost, and for goodness sake, buy and submit the affidavit. These rulings suggest (and set precedent) that the law will not be kind to ignorance or procrastination.

On a different, and slightly skeptical note, consider the following: revenue for Best Buy FY2007 (ending 5/07) was $37B, revenue for DDR FY2006 (ending 12/06) was $0.8B. Big guys are up 2-0. Hmmm.

References
1. Federal Rules of Civil Procedure. 2006 [cited 2007 May 10]; Available from: http://www.law.cornell.edu/rules/frcp/.
2. Abramson, R. Judges Rule on Hard-to-Discover Data. Legal Technology 2007 May 10 [cited 2007 May 11]; Available from: http://www.law.com/jsp/legaltechnology/pubArticleLT.jsp?id=1178701485189.

May 13, 2007

The new "forever" stamp, see DMM 101.1.2

foreverstamp.jpg

Tomorrow, the U.S. Postal Service will be releasing the First-Class forever stamp (Liberty Bell, Tom Engeman - thanks kimmco). The sales pitch is "buy them now -- use them forever". Supposedly, never again will you be scrounging around trying to figure out whether that (ancient) 39 cent stamp of yours is going to work, or if you will be needing some "bonus" stamps. Did I say 39 cents? The forever stamp is 41 cents... for 1 oz.

But hold your ponies: before you go hoarding forever stamps for your wedding invitations, the keyword here is "shape". The other new motto is "shaping a more efficient future in mail". According to the rate sheet, "letter-rate pieces that meet one or more of the nonmachinable characteristics in DMM 101.1.2 are subject to the nonmachineable surcharge (see 133.1.10)." Here are the rules:

  1. Dimensional Standards for Letters
    Letter-size mail is:
    1. Not less than 5 inches long, 3-1/2 inches high, and 0.007-inch thick.
    2. Not more than 11-1/2 inches long, or more than 6-1/8 inches high, or more than 1/4-inch thick.
    3. Not more than 3.5 ounces.
    4. Rectangular, with four square corners and parallel opposite sides. Letter-size, card-type mail pieces made of cardstock may have finished corners that do not exceed a radius of 0.125 inch (1/8 inch).
  2. Nonmachineable Criteria
    A letter-size piece is nonmachinable (see 6.4) if it has one or more of the following characteristics (see 601.1.4 to determine the length, height, top, and bottom of a mailpiece):
    1. Has an aspect ratio (length divided by height) of less than 1.3 or more than 2.5.
    2. Is polybagged, polywrapped, or enclosed in any plastic material.
    3. Has clasps, strings, buttons, or similar closure devices.
    4. Contains items such as pens, pencils, or loose keys or coins that cause the thickness of the mailpiece to be uneven (see 601.11.18, Odd-Shaped Items in Paper Envelopes).
    5. Is too rigid (does not bend easily when subjected to a transport belt tension of 40 pounds around an 11-inch diameter turn).
    6. For pieces more than 4-1/4 inches high or 6 inches long, the thickness is less than 0.009 inch.
    7. Has a delivery address parallel to the shorter dimension of the mailpiece.
    8. Is a self-mailer with a folded edge perpendicular to the address if the piece is not folded and secured according to 201.3.13.1.
    9. Booklet-type pieces with the bound edge (spine) along the shorter dimension of the piece or at the top, unless prepared according to 201.3.13.
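The purely dimensional rules above are easy to check mechanically. The sketch below covers only the letter-size limits and nonmachinable criteria 1 (aspect ratio) and 6 (minimum thickness for larger pieces); the other criteria (plastic wrap, clasps, rigidity, address orientation, and so on) are not modeled. The class and function names are hypothetical, not anything from the USPS.

```python
from dataclasses import dataclass


@dataclass
class MailPiece:
    length_in: float     # longer dimension, inches
    height_in: float     # shorter dimension, inches
    thickness_in: float  # inches
    weight_oz: float     # ounces


def is_letter_size(p):
    """Dimensional standards for letters, as listed in the rules above."""
    return (5.0 <= p.length_in <= 11.5
            and 3.5 <= p.height_in <= 6.125
            and 0.007 <= p.thickness_in <= 0.25
            and p.weight_oz <= 3.5)


def nonmachinable_reasons(p):
    """Return the dimensional nonmachinable criteria that the piece trips."""
    reasons = []
    ratio = p.length_in / p.height_in
    if ratio < 1.3 or ratio > 2.5:
        reasons.append("aspect ratio outside 1.3-2.5")
    if (p.height_in > 4.25 or p.length_in > 6.0) and p.thickness_in < 0.009:
        reasons.append("thinner than 0.009 inch for its size")
    return reasons


# A square wedding invitation versus a #10 business envelope
# (sizes are illustrative assumptions).
square_invite = MailPiece(5.5, 5.5, 0.05, 1.0)
business_envelope = MailPiece(9.5, 4.125, 0.02, 1.0)
print(is_letter_size(square_invite), nonmachinable_reasons(square_invite))
print(is_letter_size(business_envelope), nonmachinable_reasons(business_envelope))
```

The square invitation is letter-size but has an aspect ratio of 1.0, so it trips the surcharge; the business envelope passes both dimensional checks.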

Watch out if you are planning on sending wedding invitations... that's +$0.17 (17 cents) if you blow a nonmachinable spec (don't send square invites!). If you exceed the thickness, you are no longer looking at a "letter", you are looking at a flat.

One last jab. You can almost see where this is going. While the forever stamp will be "good for mailing your First-Class letters forever," I didn't see any docs defining a First-Class letter OR what shape it will be for eternity. Think about that!

May 15, 2007

American Express TrueEarnings (TruEarnings) Card

AMEX_Consumer_L.jpg

Received an invitation to the TrueEarnings Card from Costco and American Express, signed by Jud Linville (President, Consumer Cards, American Express):

  1. The pitch
    1. 5% annual rebate on automobile gas at Costco Gasoline and stand-alone gas stations
    2. 3% cash back on any restaurant (including takeout and delivery)
    3. 2% cash back for traveling (airline tickets, hotel stays, car rentals, cruises)
    4. 1% everywhere else, including at Costco
    5. No annual fee
  2. The fine print
    1. Balance transfer APR 16.74%
    2. Cash advance APR: 23.24%
    3. Default APR: 30.25%
    4. If the Default APR is applied (if minimum payments not timely paid two or more times), it will apply for a minimum of 12 consecutive billing periods beginning with the current billing period.
    5. Any promotional APR will terminate if you fail to pay any Minimum Amount Due by its Payment Due Date, or upon any condition that causes a default or other penalty rate to apply to your Account (emphasis mine)
    6. Variable rate for balance transfers is determined each billing period by adding 8.49% to Prime Rate; for cash advances, +14.99%; for defaulted accounts, 21.99%
    7. "We may apply payments and credits first to your balances with lower APRs before balances with higher APRs."
    8. Late Payment Fee: "Subject to applicable law," $15 on balances less than $400 and $35 on balances $400 or greater
    9. Overlimit fee: $29
    10. Rebates awarded every February in the form of an in-store coupon redeemable for merchandise or cash at any U.S. Costco warehouse
    11. To receive the rebate your account must be open and you must present the coupon at Costco prior to the coupon expiration date (emphasis mine)

How does this compare with the original American Express Costco Cash Rebate Credit Card? According to Frugal, "AMEX Costco Cash Rebate and Blue Cash have tiered structure, and you will need to spend $11000 and $13000 a year separately to beat a flat 1% cash rebate card." Also visit wresnick's epinion review:

Costco accepts American Express as its only credit card. The advantage of the original card was that it gave a cash rebate, but the rebate was tiered. The new card gives a one percent cash back bonus at Costco, and on most eligible purchases. However, travel expenses earn a 2% rebate, and restaurants expenses earn a 3% rebate. The old card gave back .25% on the first $2000, .5% on the next $3000, and 1.5% on amounts above that, if you paid in full each month. So if you charged $2000 per year, you got a $5 rebate. The rebate went up by .5% for those who carry a balance, but that's not really cash back so much as a factor to be considered in conjunction with the interest rate, which is more than .5% higher than some other cards.

Assuming that you don't use the card at restaurants or for travel, you would have to charge $11,000 per year on the old card to break even with the 1% rebate on the new card. The break even point is even higher if you would use the new card for restaurants and travel, and would depend on your spending habits. For example, if you eat out at restaurants twice a month on average, and spend $50 for an average meal, then the break even point would be $15,800. For some users, the original card may be a better deal. But considering the extent to which American Express is accepted, and considering the rebates on other types of cards, it may not be the best option to use American Express for some purchases.
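The break-even figures quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the rates given in the review (old card: 0.25% on the first $2,000, 0.5% on the next $3,000, 1.5% above that; new card: 1% base and 3% at restaurants). Travel spending and the carry-a-balance bonus are ignored, and the function names are mine.

```python
# Rates are in basis points (1 bp = 0.01%) and spending is in whole dollars,
# so every calculation stays in exact integers.
OLD_TIERS = [(2000, 25), (3000, 50), (None, 150)]  # (tier width in $, rate in bp)


def old_card_rebate_bp(spend):
    """Tiered rebate on the original card, in dollar-basis-points."""
    total, remaining = 0, spend
    for width, rate in OLD_TIERS:
        portion = remaining if width is None else min(remaining, width)
        total += portion * rate
        remaining -= portion
    return total


def new_card_rebate_bp(spend, restaurant=0):
    """New card: 3% on the restaurant portion, 1% on everything else."""
    return (spend - restaurant) * 100 + restaurant * 300


def break_even(restaurant=0, step=100, limit=100_000):
    """Smallest annual spend (in $step increments) where the old card catches up."""
    for spend in range(max(step, restaurant), limit + 1, step):
        if old_card_rebate_bp(spend) >= new_card_rebate_bp(spend, restaurant):
            return spend
    return None


print(old_card_rebate_bp(2000) / 10_000)   # rebate in dollars on $2,000
print(break_even())                        # no restaurant spending
print(break_even(restaurant=2 * 12 * 50))  # two $50 meals a month
```

Dividing a result by 10,000 converts it to dollars. The two break-even calls return the $11,000 and $15,800 figures from the review.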

How to mount a USB key under Solaris 10 x86

Had to do this today, and it took a few tries. Here's what you need to do (incomplete original):

  1. Insert USB key (ok if machine is alive)
  2. "iostat -En" will tell you where to find your USB device, e.g.,
    c5t0d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
    Vendor: USB 2.0 Product: Flash Disk Revision: 1.00 Serial No:
    Size: 0.13GB <130023424 bytes>
    Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
    Illegal Request: 2 Predictive Failure Analysis: 0
    --> c5t0d0
  3. "devfsadm -C"
  4. "mount -F pcfs /dev/dsk/c5t0d0p0:c /mnt/" -- Use the device name that iostat reported (c5t0d0 in the example above). Make sure you write "dsk" and not "rdsk". Also, you must have a trailing slash on the target directory!!
  5. "umount /mnt" when finished

Amazingly, it actually works. My USB key was formatted FAT32, and I suspect this technique should also work for any other kind of USB storage. If you confirm, let me know.
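If you do this often, the device lookup can be scripted. The sketch below parses "iostat -En" output like the sample shown earlier and builds the matching mount command as a string; it does not run anything. It assumes the vendor line of a USB key contains the text "USB" (true for my key, not guaranteed for yours), and the function names are made up for illustration.

```python
import re

# Sample text in the shape of the "iostat -En" output quoted in the steps above.
SAMPLE = """\
c5t0d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
Vendor: USB 2.0 Product: Flash Disk Revision: 1.00 Serial No:
Size: 0.13GB <130023424 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
"""

DEVICE_LINE = re.compile(r"^(c\d+t\d+d\d+)\s+Soft Errors:")


def usb_devices(iostat_text):
    """Return device names whose Vendor line mentions USB (a heuristic)."""
    devices, current = [], None
    for raw in iostat_text.splitlines():
        line = raw.strip()
        match = DEVICE_LINE.match(line)
        if match:
            current = match.group(1)
        elif current and line.startswith("Vendor:") and "USB" in line:
            devices.append(current)
    return devices


def mount_command(device, mount_point="/mnt/"):
    """Build the pcfs mount command from step 4 for the given device."""
    return f"mount -F pcfs /dev/dsk/{device}p0:c {mount_point}"


for dev in usb_devices(SAMPLE):
    print(mount_command(dev))
```

On a real machine you would feed it the actual output of "iostat -En" and run the printed command as root.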

May 16, 2007

Blazing fast upload at the library

In most home and business DSL or cable configs, the download bandwidth is reasonably fat (2-3 Mbps), at least fatter than a T1. However, uploads are typically quite weak. I found the (non-university) place to upload data, the Mountain View library: a zippy 5 Mbps to San Jose! Surprisingly asymmetric in the opposite way, this is where you might go if you need to upload your home videos to YouTube, or send data to a co-location facility. It makes sense, since most of the users are pulling data down, rather than pushing data up.

The trip to NY is a bit weaker...

But not bad to LA.

May 24, 2007

Unable to load source file - GeoRSS for Virtual Earth API

I was investigating some issues this morning integrating dynamically generated GeoRSS with Microsoft's Virtual Earth API. Turns out that the API requires an appropriate MIME type for GeoRSS, even when the content itself is XML. There is an answer...

I think this is simply a matter of how you are setting the MIME type when you generate the file. Here (http://www.geograph.org.uk/syndicator.php?format=GeoRSS) is a sample feed I found on the GeoRSS.org web site. Notice that it is a .php page that auto-generates a GeoRSS file (no .xml extension). It works in my VE mashup without a problem.

The PHP solution is straightforward. Before spilling out the XML, include this code:

$charset = "iso-8859-1";
$mime = "application/xml";
header("Content-Type: $mime;charset=$charset");
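The same idea carries over to other stacks: set the Content-Type header before writing the body. As a hedged illustration, here is a minimal Python WSGI sketch that returns a tiny GeoRSS feed with the same MIME type and charset as the PHP lines above. The feed contents and the app name are invented for the example, and I have not tested this particular setup against Virtual Earth.

```python
# A made-up one-item GeoRSS-Simple feed, for illustration only.
FEED = """<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0" xmlns:georss="http://www.georss.org/georss">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Hypothetical dynamically generated GeoRSS</description>
    <item>
      <title>Example point</title>
      <georss:point>37.39 -122.08</georss:point>
    </item>
  </channel>
</rss>
"""

CHARSET = "iso-8859-1"
MIME = "application/xml"


def georss_app(environ, start_response):
    """WSGI app: send the XML MIME type explicitly, then the feed body."""
    body = FEED.encode(CHARSET)
    headers = [
        ("Content-Type", f"{MIME}; charset={CHARSET}"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, something like wsgiref.simple_server.make_server("", 8000, georss_app).serve_forever() from the standard library should serve it on port 8000.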

May 29, 2007

Scorchin' in-memory db benchmark with 1 TB RAM

I've been investigating machines with large amounts of RAM for quite a few months now. I'm convinced that the distributed-centralized computing paradigm oscillates, and that the next few years will bring about monster computers with, in particular, tons of RAM, but maybe fewer processors. My sleuthing this week led to the Altix 4700. Emerging out of the ashes of bankruptcy, it is SGI's new dream machine. They boast a mind-numbing 128TB of shared memory capacity decoupled from the number of processors. This is important. For a number of years, SGI has had their NUMA* technology, which essentially promises shared access to all RAM by all processors. In most multi-processor machines, each processor really has direct access to about 16GB RAM or so (multiplied by number of cores...). If you've ever done any sort of parallel computing, it makes a difference. Essentially you have N machines each with M GB of RAM, instead of N processors sharing N*M GB of RAM. Think about inverting matrices... or hosting databases in memory.

The scoop today is a database benchmark on the University of Louisiana's 160-core Altix 4700. In particular, the database is an in-memory database system (IMDS), which means that the database is entirely hosted in memory. This is important, because there are a lot of optimizations that SQL Server, MySQL, Oracle, do to compensate for the fact that databases sit on slow hard drives and memory is hard to come by. Fancy things like memcaches, etc. These all go away if you have enough memory to store the whole thing simply in RAM. Here are the goods:

For a simple SELECT against the 1.17 Terabyte, 15.54 billion row database, eXtremeDB-64 processed 87.78 million query transactions per second using its native application programming interface (API) and 28.14 million transactions per second using a SQL ODBC API. To put these results in perspective, consider that the lingua franca for discussing query performance is transactions per minute; In more complex JOIN operations, the benchmark report documents performance of 11.13 million operations per second with the native API, and 4.36 million operations per second using SQL ODBC; [emphasis mine]

In case you were wondering, IMDS speed >> database hosted on RAM drive. (I also asked this question.) You would expect it to be a little bit faster, but it turns out that all the extraneous operations when you don't plan on being in RAM really take up a lot of time. Of course, the benchmark comes directly from McObject's marketing department, so YMMV.

About May 2007

This page contains all entries posted to Tim's Journal in May 2007. They are listed from oldest to newest.

Creative Commons License
This weblog is licensed under a Creative Commons License.