Tuesday, December 23, 2008

Cloud Computing Conference in 2009

The Cloud Slam Conference is the world's premier cloud computing event, covering research, development, innovations and education in the world of cloud computing. The Technical Program is unmatched, and reflects the highest level of accomplishments in the cloud computing community, while the invited presentations feature an exceptional lineup of speakers. The panels, workshops, and tutorials are selected to cover a range of the hottest topics in cloud computing.

Descriptions of our conference tracks are presented below.

Technology
Data Centers, HPC, Cloud Storage, Hardware/Equipment, Software, Platforms, Virtualization, SaaS, E-government (eScience and eEducation), Security, Monitoring, Distributed Technologies, Data in Cloud, CDN

Industry Implementation Experience
Learn about experiences transitioning traditional IT resources into the cloud and efforts to use the cloud to support business processes that have traditionally been supported by in-house IT. This track has a usage and industry focus, covering application to, and lessons learned from, institutional and/or retail financial businesses, life sciences, manufacturing, retail, medicine, pharmaceuticals, etc. Discussion of practical usage and of issues discovered and addressed can help promote a shift in the perception of Cloud Computing.

Business Models
This track discusses how to make money in the cloud - for startups, incumbents and venture capital/investors. It also covers how to build a career in Cloud Computing (recommended for HR, recruiters, candidates, institutions and course providers).

Legal Aspects: Compliance, Privacy
What is personal information? (The definition is quite broad, but varies from jurisdiction to jurisdiction.) What is privacy? (a.k.a. "informational self-determination", i.e. the right or ability of individuals to exercise a measure of control over the collection, use and disclosure of their personal information by others). Where is the personal data stored? (This has profound implications for jurisdiction and applicable laws, transparency, accountability, recourse, etc.)
What laws apply to the data in question (privacy, security, business transaction, consumer protection, etc.)? Is the data secure? (How would you know? What assurances are available? Does the assurance level depend on sensitivity? Is it auditable?)
Is individual consent provided? How? Is it informed? Can it be revoked? Can conditions be attached?
Who owns the data? (In the U.S., it is exclusively the organization's; everywhere else, ownership is a "shared responsibility".)
Who (e.g. third parties, agents) is the data shared with? Under what circumstances?
Are there data breach disclosure requirements in place? If a breach occurs, will you be informed? What remedies, if any, will be offered?
What steps or measures can people and organizations take to limit exposure and liability?
In this information era of unlimited storage, mirrors, backups, etc., has data deletion (when no longer needed for its original purpose) become an obsolete idea?

Research
Presentations from academia on current and future aspects of cloud computing (e.g. running MATLAB/Star-P/Mathematica parallel calculations for life sciences projects in cloud environments, or storing data sets such as the Human Genome, the U.S. Census and labor statistics in the cloud to make the information easier for researchers to access).
We invite you to propose a presentation for Cloud Slam 2009. The deadline for abstract submissions is Monday, January 19, 2009. Click here for details on submission guidelines.


For more information see website at http://cloudslam09.com


Tuesday, September 23, 2008

Integration of workflow, rules and monitoring in compute cloud


This is something I had planned to describe long ago but didn't, for many reasons. While working with provisioning workflows in virtualized environments (we had only Xen in 2005), I saw a need to separate workflows from rules, so that business logic is easy to maintain.
Take a look at the sequence diagram below:

[Sequence diagram]

It describes the transaction from capturing an event from the sensor network (in this case, gmond metrics) to executing arbitrary code in response to a specified condition.
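
To make the idea concrete, here is a minimal sketch of such a separation (my own illustration rather than the actual provisioning code; the metric names, thresholds and actions are hypothetical):

# A minimal sketch of keeping rules separate from the workflow code that
# reacts to monitoring events. Metric names, thresholds and actions are
# hypothetical examples, not part of any real provisioning system.

def provision_extra_node(event):
    print("provisioning an extra node because of %s" % event)

def send_alert(event):
    print("alerting operator: %s" % event)

# Rules live in data, not in the workflow code, so business logic is easy to maintain.
RULES = [
    {"metric": "load_one", "condition": lambda v: v > 8.0, "action": provision_extra_node},
    {"metric": "disk_free", "condition": lambda v: v < 2.0, "action": send_alert},
]

def handle_event(event):
    # Called for every metric captured from the sensor network (e.g. gmond).
    for rule in RULES:
        if rule["metric"] == event["metric"] and rule["condition"](event["value"]):
            rule["action"](event)

# Example: a gmond-style metric sample arriving from a node.
handle_event({"host": "node01", "metric": "load_one", "value": 9.5})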



Wednesday, September 17, 2008

Ocean floor data center



During a discussion of floating data centers at the Cloud Computing Group,
I came up with the idea of placing data centers on the ocean floor, close to trans-ocean cable landings. Crazy, huh? :)


It might start at the edge of neutral waters, around 5-10 km from the shoreline, and later expand along major submarine cable systems.



Perhaps initially these might be non-serviceable sealed containers with racks, powered by either

a) generators harnessing natural forces such as waves or currents
b) galvanic cells
c) hydrogen (there's definitely no lack of water)

and cooled by the external medium (also water).

ps. Original images are from this really cool site.





Update on November 14, 2010.
Here's another technology to get electric power for DC.

Swedish company Minesto's underwater kite resembles a child's toy as it swoops and dives in ocean currents. But since seawater is 800 times as dense as air, the small turbine attached to the kite — which is tethered to the ocean floor — can generate 800 times more energy than if it were in the sky. Minesto calls the technology Deep Green and says it can generate 500 kilowatts of power even in calm waters; the design could increase the market for tidal power by 80%, the company says. The first scale model will be unveiled next year off the coast of Northern Ireland.




Monday, June 2, 2008

Patterns in Cloud Computing

Recently I posted a small piece of my thoughts on patterns in cloud computing to the mailing list ( http://groups.google.ca/group/cloud-computing/browse_thread/thread/86ee2f8f4afd71c1# ), comparing on-demand computing resource pools to human resources.
Indeed, both target the same goal: decreasing fixed costs of ownership and leaving more room for manoeuvre. There are some differences, though: computing resources (CPU, RAM, disk storage, etc.) are of known quality, while people (consultants) are not always.

Another thing that has been lurking in my mind: if you compare a compute cloud to an Earth subsystem like the hydrosphere (which is actually fed by precipitation from the atmosphere - read: clouds), it has similar properties, such as streams, pollution, draining, freezing, etc.

Freezing of storage resources might be compared to a loss of elasticity, such as the inability of some storage space to efficiently hold an optimal amount of data: for example, 1 GB of storage holding 10 copies of the same file (or slightly different versions) under different names. I also see some dynamic process eating RAM, which reminds me of water changing physical state (steam or a geyser, if you like).

This might be solved by a deduplication process, so the storage unit gets back its original properties (defragmentation comes to mind as well).
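
As a rough illustration of what such a restoration process could do, here is a minimal content-hashing sketch (my own example, not taken from any particular cloud product) that finds duplicate files, so that extra copies could be replaced with references to a single stored copy:

# Minimal sketch: find duplicate files by content hash. The directory path
# used at the bottom is a hypothetical example.
import hashlib
import os

def file_digest(path, chunk_size=1 << 20):
    # Return the SHA-1 digest of a file, read in chunks.
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    # Map content digest -> list of paths with identical contents.
    by_digest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            by_digest.setdefault(file_digest(path), []).append(path)
    return dict((d, p) for d, p in by_digest.items() if len(p) > 1)

for digest, paths in find_duplicates("/var/storage").items():
    print(digest, paths)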

I don't know yet whether these restoration/optimization processes should belong to the system (the compute cloud solution) or to the entities that run inside the cloud; the future will show.

Jumping aside, I would also note the popular Observer pattern, used in publish/subscribe systems, which I suggested using for cloud implementations in some other posts to the same discussion list, for the purpose of reducing bandwidth and resource consumption.
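
For readers who haven't met it, here is a minimal Observer/publish-subscribe sketch (an illustrative example only): subscribers register interest in a topic and are notified on change, instead of polling continuously and consuming bandwidth.

# Minimal Observer / publish-subscribe sketch. Topic names and subscribers
# are hypothetical examples.
class Publisher:
    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers.get(topic, []):
            callback(message)

broker = Publisher()
broker.subscribe("capacity", lambda msg: print("scaler saw: %s" % msg))
broker.subscribe("capacity", lambda msg: print("billing saw: %s" % msg))
broker.publish("capacity", {"host": "node01", "free_cpu": 0.25})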

I'll keep you posted in this thread. If you think I'm crazy, comments are welcome.

Tuesday, May 27, 2008

Alfresco Cluster in Compute Cloud (Amazon EC2)

Synopsis: This article describes a simplified process for setting up an Alfresco cluster in a cloud computing environment (Amazon Elastic Compute Cloud).

Pre-requisites.
  • You need an Amazon Elastic Compute Cloud account (make sure you have files like pk-ABCDAABCDAABCDAABCDAABCDAABCDAABCDA.pem and cert-ABCDAABCDAABCDAABCDAABCDAABCDAABCDA.pem)
  • Basic knowledge of Alfresco CMS
  • Some knowledge of Amazon EC2 tools
Part 1. Introduction

What is Alfresco and why it is cool.

Alfresco is the leading open source alternative for enterprise content management. The open source model allows Alfresco to use best-of-breed open source technologies and contributions from the open source community to get higher quality software produced more quickly at much lower cost.

The Benefits of Using Alfresco

  • Ease-of-Use
  • Intelligent Virtual File System – As simple to use as a shared drive through CIFS, WebDAV or FTP
  • Google®-Like Search and Yahoo!®-Like Folder Browsing

Developer Productivity

  • Aspect Oriented Rules Development through Simple-to-Use Wizards
  • Rules and Actions Managed in the Server once for all Interfaces


Best-Practice Collaboration

  • Pre-Configured Smart-Space Templates – Project Structure, Content, Logic, Lifecycles
  • Forums – Threaded Discussions on Folders or Documents


Administrator Productivity

  • Simple Server Install and No Client Install
  • Advanced Content Security Management
  • Advanced Search/Knowledge Management
  • Sophisticated Content, Attribute, Location, Object Type and Multiple Taxonomy/Category Search

Scalability

  • Distributed Architecture
  • Highly Scalable and Fault Tolerant Service Oriented Architecture


Open Source

  • Dramatically Lower Cost

What is Amazon EC2 and why it is also cool.

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
Amazon EC2's simple web service interface allows you to obtain and configure capacity with minimal friction. It provides you with complete control of your computing resources and lets you run on Amazon's proven computing environment.

Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change. Amazon EC2 changes the economics of computing by allowing you to pay only for capacity that you actually use. Amazon EC2 provides developers the tools to build failure resilient applications and isolate themselves from common failure scenarios.


Why a combination of both is uber-cool.


The benefits of the cloud allow you to provision a preconfigured digital asset repository on demand and scale it dynamically, thus reducing total cost of ownership considerably.

Part 2. Getting your Cluster in 15 minutes.

Run management instance.

I use the ElasticFox extension to do basic tasks on EC2. Run an instance of Amazon Machine Image ami-68af4b01. After the state of the instance changes from pending to running, get its Public DNS and put it into the browser address bar. This should bring up the Web User Interface for Alfresco Cluster Management. You can log in with the following credentials: username alfresco, password alfresco. (If you prefer to launch the instance from code instead of ElasticFox, see the sketch after the list of options below.) You should now have a left navigation with the following options:

* Add New Alfresco Cluster

* Alfresco Cluster Home

* Cluster Provisioning Status

* General Settings

* List All Alfresco Clusters

* Provision Alfresco Cluster Nodes
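
As mentioned above, the management instance can also be launched programmatically. A minimal sketch with the boto library (assuming boto is installed; the credentials and instance type below are placeholders you would replace):

# Minimal sketch: launch the management AMI with boto instead of ElasticFox.
# Access/secret keys and the instance type are placeholder examples.
import time
import boto

conn = boto.connect_ec2("YOUR-ACCESS-KEY", "YOUR-SECRET-KEY")
reservation = conn.run_instances("ami-68af4b01", instance_type="m1.small")
instance = reservation.instances[0]

# Wait until the instance leaves the "pending" state, then print its Public DNS.
while instance.state == "pending":
    time.sleep(10)
    instance.update()
print(instance.state, instance.public_dns_name)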

Setup EC2 credentials.

Using PuTTY or another SSH client, log in to the Public DNS address and copy your key files to a dedicated path (e.g. /root). Now you need to edit the following file, /var/www/sites/all/modules/alfresco_ec2/ws-amazon-ec2-client.php, and add the names of your key files.

Provision minimal cluster.

At this point you are ready to provision your minimal 3-node Alfresco cluster. Proceed to the web interface and choose the Add New Alfresco Cluster option. Fill in the form, then choose the List All Alfresco Clusters option in the left navigation to see your new cluster. To actually provision nodes to it, click the radio button for the specific cluster and press the submit button 'Provision Nodes for Selected Cluster Now'. This will lock the browser for a pretty long time (around 6-7 minutes); I have a modified version of the software that handles this in a more elegant fashion. After the provisioning transaction finishes, you'll see the results on the screen.

Test cluster functionality.

In the ElasticFox extension, note the new instances, which are, respectively, the DB Node, Master Node and Slave Node. Get their Public DNS addresses and open them in the browser (the URL should look like http://ec2-00-100-200-100.compute-1.amazonaws.com:8080/alfresco). Log in to one of the instances (or re-log in) as admin:admin and add content. Then log in to another instance and see whether the content of the space shows the new file.

Conclusions.

This setup allows one-click provisioning of an Alfresco cluster on EC2 and provides an enterprise-class digital asset management system that can be used for multiple purposes.

Future Work.

Scale-On-Demand.

High-Availability Configuration.

Media Streaming Solutions.

LDAP integration.

Single Sign On/Out.

Contact.

If you have any questions, please write me an email at sapenov at gmail dot com.

Wednesday, April 9, 2008

Cloud Computing

A Cloud Computing Group has been created on LinkedIn.
If you are involved in parallel computation, high performance computing or related fields, or are just interested in the future directions of the subject, please join the group at http://www.linkedin.com/e/gis/61513/6213F13BB1AA

or write to join.cloud.computing@gmail.com

For discussion, join the mailing list at http://groups.google.ca/group/cloud-computing

Archive of posts http://computingondemand.blogspot.com/

Wednesday, February 13, 2008

Setting Up Windows 2008 in Virtual Box

Synopsis: This article describes how to set up networking and activation in Microsoft Windows 2008 Server Std running under VirtualBox.


If you have installed Windows 2008 Server and can't get past the activation step, this how-to is for you.

Algorithm - short description:

1. Run explorer.
2. Get VirtualBox Guest Additions installed.
3. Install Network Driver from VirtualBox cdrom disk and reboot.
4. Activate Windows.

Detailed description:

For step 1 read and use this article http://www.petri.co.il/bypass-windows-server-2008-activation.htm

Step 2 is fairly easy - just click the 'Install ...' option in the VirtualBox drop-down menu and accept the autorun option inside the guest OS.

Step 3 is as easy as updating the driver and pointing it to the CD-ROM drive with the VirtualBox Guest Additions directory, called something like amd_net.

Step 4 might require a reboot; after that, just follow the on-screen instructions.

At this point you should have Windows 2008 server running under VirtualBox.

Friday, February 1, 2008

Benchmarking CouchDB - READ 6000

[root@domU-12-31-35-00-3D-82 ~]# ab -t100 -n 6000 -c 6000 http://ec2-67-202-22-9.compute-1.amazonaws.com:5984/
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking ec2-67-202-22-9.compute-1.amazonaws.com (be patient)
Completed 600 requests
Completed 1200 requests
Completed 1800 requests
Completed 2400 requests
Completed 3000 requests
Completed 3600 requests
Completed 4200 requests
Completed 4800 requests
Completed 5400 requests
Finished 6000 requests


Server Software: inets/develop
Server Hostname: ec2-67-202-22-9.compute-1.amazonaws.com
Server Port: 5984

Document Path: /
Document Length: 215 bytes

Concurrency Level: 6000
Time taken for tests: 9.224306 seconds
Complete requests: 6000
Failed requests: 182
(Connect: 0, Length: 182, Exceptions: 0)
Write errors: 0
Non-2xx responses: 5870
Total transferred: 2145510 bytes
HTML transferred: 1269980 bytes
Requests per second: 650.46 [#/sec] (mean)
Time per request: 9224.306 [ms] (mean)
Time per request: 1.537 [ms] (mean, across all concurrent requests)
Transfer rate: 227.12 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 4 509 1078.4 10 7714
Processing: 8 1118 2672.2 69 8908
Waiting: 4 1003 2562.0 34 8896
Total: 12 1627 2844.8 97 9208

Percentage of the requests served within a certain time (ms)
50% 97
66% 453
75% 3028
80% 3103
90% 8319
95% 8571
98% 8966
99% 9039
100% 9208 (longest request)
[root@domU-12-31-35-00-3D-82 ~]#

Benchmarking CouchDB - READ 2000

[root@domU-12-31-35-00-3D-82 ~]# ab -k -t100 -n 3000 -c 2200 http://ec2-67-202-22-9.compute-1.amazonaws.com:5984/
This is ApacheBench, Version 2.0.41-dev <$Revision: 1.141 $> apache-2.0
Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright (c) 1998-2002 The Apache Software Foundation, http://www.apache.org/

Benchmarking ec2-67-202-22-9.compute-1.amazonaws.com (be patient)
Completed 300 requests
Completed 600 requests
Completed 900 requests
Completed 1200 requests
Completed 1500 requests
Completed 1800 requests
Completed 2100 requests
Completed 2400 requests
Completed 2700 requests
Finished 3000 requests


Server Software: inets/develop
Server Hostname: ec2-67-202-22-9.compute-1.amazonaws.com
Server Port: 5984

Document Path: /
Document Length: 215 bytes

Concurrency Level: 2200
Time taken for tests: 2.830302 seconds
Complete requests: 3000
Failed requests: 0
Write errors: 0
Non-2xx responses: 3009
Keep-Alive requests: 0
Total transferred: 1073998 bytes
HTML transferred: 646720 bytes
Requests per second: 1059.96 [#/sec] (mean)
Time per request: 2075.555 [ms] (mean)
Time per request: 0.943 [ms] (mean, across all concurrent requests)
Transfer rate: 370.28 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 11 26.9 2 115
Processing: 3 12 22.6 5 168
Waiting: 1 9 21.8 3 165
Total: 5 23 41.6 7 280

Percentage of the requests served within a certain time (ms)
50% 7
66% 9
75% 9
80% 36
90% 68
95% 126
98% 188
99% 203
100% 280 (longest request)
[root@domU-12-31-35-00-3D-82 ~]#

Benchmarking CouchDB - READ

etch:~# ab -k -n 210 -c 140 http://ec2-67-202-22-9.compute-1.amazonaws.com:5984/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking ec2-67-202-22-9.compute-1.amazonaws.com (be patient)
Completed 100 requests
Completed 200 requests
Finished 210 requests


Server Software: inets/develop
Server Hostname: ec2-67-202-22-9.compute-1.amazonaws.com
Server Port: 5984

Document Path: /
Document Length: 44 bytes

Concurrency Level: 140
Time taken for tests: 0.412173 seconds
Complete requests: 210
Failed requests: 52
(Connect: 0, Length: 52, Exceptions: 0)
Write errors: 0
Non-2xx responses: 52
Keep-Alive requests: 0
Total transferred: 60434 bytes
HTML transferred: 18132 bytes
Requests per second: 509.49 [#/sec] (mean)
Time per request: 274.782 [ms] (mean)
Time per request: 1.963 [ms] (mean, across all concurrent requests)
Transfer rate: 143.14 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 71 53.7 110 118
Processing: 0 130 92.1 142 261
Waiting: 0 129 92.3 141 260
Total: 0 202 126.9 252 377

Percentage of the requests served within a certain time (ms)
50% 252
66% 257
75% 264
80% 270
90% 369
95% 373
98% 376
99% 376
100% 377 (longest request)
etch:~# ab -k -n 210 -c 130 http://ec2-67-202-22-9.compute-1.amazonaws.com:5984/
This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $> apache-2.0
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Copyright 2006 The Apache Software Foundation, http://www.apache.org/

Benchmarking ec2-67-202-22-9.compute-1.amazonaws.com (be patient)
Completed 100 requests
Completed 200 requests
Finished 210 requests


Server Software: inets/develop
Server Hostname: ec2-67-202-22-9.compute-1.amazonaws.com
Server Port: 5984

Document Path: /
Document Length: 44 bytes

Concurrency Level: 130
Time taken for tests: 0.538343 seconds
Complete requests: 210
Failed requests: 0
Write errors: 0
Keep-Alive requests: 0
Total transferred: 68900 bytes
HTML transferred: 11440 bytes
Requests per second: 390.09 [#/sec] (mean)
Time per request: 333.260 [ms] (mean)
Time per request: 2.564 [ms] (mean, across all concurrent requests)
Transfer rate: 124.46 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 22 130 70.0 134 257
Processing: 27 139 62.5 138 261
Waiting: 2 128 67.3 124 257
Total: 240 269 18.4 283 284

Percentage of the requests served within a certain time (ms)
50% 283
66% 283
75% 283
80% 284
90% 284
95% 284
98% 284
99% 284
100% 284 (longest request)
etch:~#

Thursday, January 24, 2008

Benchmarking CouchDB

After installing CouchDB, I was wondering about its performance, so I did some benchmarking, and it looks pretty good to me - 2,000 concurrent requests.

I only had to raise the number of open files per process by using

echo "204854" > /proc/sys/fs/file-max

ulimit -n 200000

to get rid of this message
OS error code 24: Too many open files

I'll publish the results as Google Docs so they are posted to this blog automatically, and I'll provide just a link to the results.

These are the read tests:

http://ihatecubicle.blogspot.com/2008/02/benchmarking-couchdb-read.html

http://ihatecubicle.blogspot.com/2008/02/benchmarking-couchdb-read-2000.html

http://ihatecubicle.blogspot.com/2008/02/benchmarking-couchdb-read-6000.html

I'm working on creating files with sample data to emulate documents for insertion, so we can test write efficiency.
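
A minimal sketch of what such a write test could look like, using CouchDB's plain HTTP API (the database name, document shape and document count are hypothetical examples):

# Minimal sketch: create a database and insert sample JSON documents over
# CouchDB's HTTP API to exercise writes. Names and contents are examples.
import json
import urllib.request

BASE = "http://localhost:5984"

def request(method, path, body=None):
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(BASE + path, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

request("PUT", "/write_test")        # create the test database (fails if it already exists)
for i in range(1000):                # insert sample documents
    request("POST", "/write_test", {"doc_id": i, "payload": "x" * 512})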

Wednesday, January 23, 2008

Amazon EC2 - server response 400 (Bad Request)

etch:/mnt# ec2-upload-bundle -b khaz_debian_elasticdrive041_02 -m /mnt/image.manifest.xml -a key -s secretkey

Setting bucket ACL to allow EC2 read access ...
Error: could not create or access bucket khaz_debian_elasticdrive041_02: server response 400 (Bad Request)
ec2-upload-bundle failed

Trying the --debug option showed that the create-bucket operation fails.

Searching the Amazon EC2 forums didn't turn up anything valuable, so I tried to create the bucket manually and voila - it spits out an error along the lines of "You've tried to create more buckets than allowed". Deleting some garbage buckets helped get rid of this hard-to-trace error (at least for me).
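
One way to reproduce the bucket creation manually is a small script against S3. A sketch with the boto library (the credentials are placeholders; the bucket name is the one from the failing upload above):

# Minimal sketch: create the bucket directly with boto to surface the real
# S3 error (e.g. exceeding the bucket limit), and list existing buckets to
# spot garbage ones worth deleting. Keys are placeholders.
import boto
from boto.exception import S3ResponseError

conn = boto.connect_s3("YOUR-ACCESS-KEY", "YOUR-SECRET-KEY")
try:
    conn.create_bucket("khaz_debian_elasticdrive041_02")
except S3ResponseError as err:
    print(err.status, err.reason, err.body)

for bucket in conn.get_all_buckets():
    print(bucket.name)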

Setup Elastic Drive in Jeos/VMWare

Sometimes if you try to run Elastic Drive

root@khazJeosED:/opt/e# sudo elasticdrive /etc/elasticdrive.ini -d

and get the following error:

Traceback (most recent call last): File "/usr/bin/elasticdrive", line 5, in pkg_resources.run_script('ElasticDrive==0.4.1-FREE-5-r54', 'elasticdrive') File "build/bdist.linux-i686/egg/pkg_resources.py", line 448, in run_script File "build/bdist.linux-i686/egg/pkg_resources.py", line 1173, in run_script File "/usr/bin/elasticdrive", line 175, in
File "/usr/bin/elasticdrive", line 156, in main
File "build/bdist.linux-i686/egg/elasticdrive/app_server.py", line 99, in start_drives File "build/bdist.linux-i686/egg/elasticdrive/s3_simple.py", line 57, in __init__ File "/usr/lib/python2.5/site-packages/boto-0.9d-py2.5.egg/boto/s3/connection.py", line 103, in create_bucket raise S3ResponseError(response.status, response.reason, body) boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden RequestTimeTooSkewedThe difference between the request time and the current time is too large.2008-01-23T18:47:06Z439AD07AEA9A14C6900000a7RNB5bY7n1muIce2Ht40zcMUj4qkwJV6vtHdgWFzA6ezn3aov171Ov1UrHXWweZWed, 23 Jan 2008 01:31:49 GMT Exception exceptions.AttributeError: "s3_simple_engine instance has no attribute 'locker'" in > ignored

it means that the local time should be adjusted to comply with Amazon S3's clock-skew requirement (the maximum allowed difference is about 15 minutes).

Extract the time part from the response, 2008-01-23T18:47:06Z, subtract 5 hours (the local timezone offset in this case), and use it to set the clock. Run the following command:

date --set="2008-01-23 13:47:43"
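
If you'd rather compute that value than do the arithmetic by hand, a tiny sketch (the 5-hour offset is just this example's timezone):

# Convert the UTC timestamp from the S3 error into local time in the
# format expected by `date --set`. The 5-hour offset is an example.
from datetime import datetime, timedelta

server_utc = datetime.strptime("2008-01-23T18:47:06Z", "%Y-%m-%dT%H:%M:%SZ")
local = server_utc - timedelta(hours=5)
print(local.strftime("%Y-%m-%d %H:%M:%S"))   # feed this to date --set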

Run it again:

root@khazJeosED:/opt/e# sudo elasticdrive /etc/elasticdrive.ini -d

If there are no errors, let's check that everything went well:

#ps aux
...
root 5480 0.2 1.0 18176 5284 ? Ssl 13:59 0:00 /usr/bin/python /usr/bin/elasticdrive /etc/elasticdrive.ini -d
root 5482 0.2 1.0 18128 5348 ? Ssl 13:59 0:00 /usr/bin/python /usr/bin/elasticdrive /etc/elasticdrive.ini -d
root 5485 0.0 0.9 9660 4836 ? S 13:59 0:00 /usr/bin/python /usr/bin/elasticdrive /etc/elasticdrive.ini -d
root 5488 0.0 0.1 2620 1000 pts/0 R+ 14:00 0:00 ps aux

Let's mount the filesystem:

root@khazJeosED:/opt/e# sudo mount -o loop /fuse2/ed0 /s3
mount: you must specify the filesystem type

This might mean that you need to run:

#mke2fs -b 4096 /fuse2/ed0

mke2fs 1.40.2 (12-Jul-2007)
/fuse2/ed0 is not a block special device.Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
65536 inodes, 65536 blocks
3276 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=67108864
2 block groups
32768 blocks per group, 32768 fragments per group
32768 inodes per group
Superblock backups stored on blocks:
32768
Writing inode tables: done

Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.
root@khazJeosED:/opt/e#


Now we can run the previous command:

root@khazJeosED:/opt/e# sudo mount -o loop /fuse2/ed0 /s3
root@khazJeosED:/opt/e#

Let's check that it went fine:

root@khazJeosED:/opt/e# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 2.8G 474M 2.2G 18% /
varrun 252M 44K 252M 1% /var/run
varlock 252M 0 252M 0% /var/lock
udev 252M 44K 252M 1% /dev
devshm 252M 0 252M 0% /dev/shm
/fuse2/ed0 248M 144K 236M 1% /s3

Monday, January 21, 2008

Running Selenium Grid on Amazon EC2

Synopsis:
This article describes the process of setting up and running the Selenium Grid web testing tool in a distributed fashion.

Pre-requisites:
You should have an Amazon Elastic Compute Cloud account.
It is implied that you have basic Unix/Linux administration skills and some experience with Amazon EC2 tools.
Some knowledge of Selenium testing is helpful ( read more at http://www.openqa.org/selenium-grid/ ).


How to start rolling:
1. Run an instance of AMI ami-fd37d294 and get the Public DNS of the instance.

2. Log in to an SSH terminal using the Public DNS.

3. Run the following commands:

#cd /usr/share/selenium-grid-0.9.3
#ant sanity-check


Instances of this AMI can be run as a "hub" or a "remote control", depending on your layout. If this instance is going to be a remote control:

#ant -Dport= -Dhost= -DhubURL= launch-remote-control
ant -Dport=5555 -Dhost=testhost -DhubURL=http://your-hub-url:4444 -Denvironment="Firefox on Linux" launch-remote-control


Future work:
- pass parameters before launch, so the instance runs in hub or remote control mode, all neat and automated
- add Firefox on Windows and Safari on OS X profiles

Sunday, January 20, 2008

Geotourism in Kazakhstan



What is geotourism ?
National Geographic defines geotourism as tourism that sustains or enhances the geographical character of a place - its environment, culture, aesthetics, heritage, and the well-being of its residents. Geotourism incorporates the concept of sustainable tourism - that destinations should remain unspoiled for future generations - while allowing for enhancement that protects the character of the locale. Geotourism also adopts a principle from its cousin, ecotourism - that tourism revenue can promote conservation - and extends that principle beyond nature travel to encompass culture and history as well: all distinctive assets of a place.

Challenge
Nominate your favorite example of geotourism -- defined by National Geographic as tourism that sustains or enhances the geographical character of a place: its environment, heritage, culture, aesthetics, and the well-being of its residents.
All nominees will receive a personal email inviting them to enter the National Geographic-Ashoka's Changemakers Geotourism Challenge launching on January 30, 2008. All nominators who tell their first-person story become eligible to win one of two whl.travel prizes.
http://www.changemakers.net/en-us/competition/geotourism

My nomination - Falcon hunt in Central Kazakhstan

Hunting with golden eagles in Kazakhstan is an ancient and deeply esteemed tradition, and the hunters - berkutchi - are among the most respected people in society.
The first World Falconry Festival took place on July 14-15, 2007 in Reading, Berkshire, United Kingdom: http://www.falconryfestival.com/
The festival gathered teams from 33 countries, including the USA, France, Germany, Belgium, Japan, Brazil, China, South Korea, Turkmenistan, England and others. The Kazakhstan team was recognized as the best one.

If you are new to falconry, Alan Gates has a good article about hunting with falcons at http://www.avmv20.dsl.pipex.com/Articles/The%20Hunt.htm

I would like to present my friend's falconry estate in Central Kazakhstan, which offers falcons, hawks and golden eagles for hunting wolf, fox, hare, black grouse, ducks, partridges and pigeons. Since 1995 they have held the international competition “Salburn” - hunting with birds and the Kazakh “tazy” wolfhound while riding horses.

For more details visit their website at http://www.pmicro.kz/~falcon/falcon_e/
or just give them a call:
Tileukabyl Esembekuly: 7 7212 45-41-02 , 7 300 244 65 69, tleu.e@rambler.ru
Karlygash Makataeva: 7 7272 71-26-17, 7 300 755 20 86, kmakataeva@yahoo.com
[UPDATE]

Thursday, January 10, 2008

Benchmarking Windows 2003 Server in Amazon EC2

Abstract: Some notes on trying to run Windows 2003 Server EE on Amazon EC2 extra-large instances (4-way).

SMP
I tried to use Qemu's -smp n feature on Amazon Elastic Compute Cloud, but it still uses only one processor. It is possible to set affinity and run each Qemu instance on an individual processor, but this is a very atypical scenario. This article also explains Qemu's "rdtsc" usage on SMP hosts: http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00652.html

ramfs
Moving the system disk to RAM decreased disk-related I/O by 10-15%, but the CPU-bound bottleneck still slowed the system down.

ACPI SMP
Windows Device Manager showed that our installation uses the MP kernel, reported as "ACPI Multiprocessor PC", i.e. the ACPI APIC MP HAL (Halmacpi.dll).

I'll keep you informed on my other findings.

Monday, January 7, 2008

How to setup LDAP authentication in Alfresco

Synopsis: This article describes the process of setting up LDAP authentication in the Alfresco content management system.


OpenLDAP
Setting it up is pretty trivial; I used yum. It is important to add initial entries to a fresh installation, since it comes completely empty and spits out errors. This is how I did it:
#nano khaz.ldif
dn: dc=khaz-domain,dc=com
objectClass: domain
dc: khaz-domain

#slapadd -l khaz.ldif
#chown ldap:ldap /var/lib/ldap/objectClass.bdb
#/etc/rc.d/init.d/ldap restart
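
To verify that the entry was added, a minimal sketch with the python-ldap library (using an anonymous bind against a local installation; adjust the server URL and base DN to your setup):

# Minimal sketch: anonymously bind to the local OpenLDAP server and list
# the entries under the base DN created above.
import ldap

conn = ldap.initialize("ldap://localhost")
conn.simple_bind_s()   # anonymous bind
for dn, attrs in conn.search_s("dc=khaz-domain,dc=com", ldap.SCOPE_SUBTREE, "(objectClass=*)"):
    print(dn, attrs)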

Atlassian Crowd
Crowd is a web-based single sign-on (SSO) tool that simplifies application provisioning and identity management. I used it as a front-end tool for OpenLDAP to manage users.
Install the software and log in to the administration panel at something like http://yoursite.com:8095/crowd/console.
Choose the Directories tab and click Add Directory. Choose the Connector option (Crowd supports several connectors, such as Active Directory, Sun ONE and Open Directory) and fill in the details of your OpenLDAP installation.

Alfresco
I used the bundled version (Tomcat + Alfresco) with the HSQL database, which can be switched to another one such as MySQL.
First, I tried it on my desktop in VMware Server and then on an Amazon EC2 instance running Fedora Core 6.
During the initial stage I turned on debug mode to see exactly what was going on, and it really helped me trace the LDAP communication between my OpenLDAP server and Alfresco.
Use these settings as guidance:
/opt/alfresco/tomcat/shared/classes/alfresco/extension/chaining-authentication-context.xml
/opt/alfresco/tomcat/shared/classes/alfresco/extension/ldap-authentication-context.xml

http://s3.amazonaws.com/khaz_download/chaining-authentication-context.xml

http://s3.amazonaws.com/khaz_download/ldap-authentication-context.xml

Adding users to Alfresco
Log in to the Crowd panel.
Choose the Principals tab, then select OpenLDAPForAlfresco (this is what I named it; yours might have a different name) in the Directory dropdown and hit the Search button.
This should bring up a list of users in the directory. To add a new user, locate Add Principal in the Principal Browser tab and click it. This will change to a form where you fill in the user details and select the proper directory for the user to belong to.
Upon successful creation of the user account, you can test it in Alfresco at http://youralfrescoinstallation.com:8080/alfresco. At this point all users are managed outside of Alfresco and can easily be attached to other services such as single sign-on and OpenID.
