Tag Archives: Solution
Using Rapid Miner and R for Sports Analytics #rstats
Ajay- Why did you choose Rapid Miner and R? What were the other software alternatives you considered and discarded?
Analyst- We considered most of the other major players in statistics/data mining or enterprise BI. However, we found that the value proposition for an open source solution was too compelling to justify the premium pricing that the commercial solutions would have required. The widespread adoption of R and the variety of packages and algorithms available for it made it an easy choice. We liked RapidMiner as a way to design structured, repeatable processes, and for the ability to optimize learner parameters in a systematic way. It also handled large data sets better than R on 32-bit Windows did. The GUI, particularly when 5.0 was released, made it more usable than R for analysts who weren’t experienced programmers.
Ajay- What analytics do you think Rapid Miner and R are best suited for?
Analyst- We use RM+R mainly for sports analysis so far, rather than for more traditional business applications. It has been quite suitable for that, and I can easily see how it would be used for other types of applications.
Ajay- Any experiences as an enterprise customer? How was the installation process? How good is the enterprise level support?
Analyst- Rapid-I has been one of the most responsive tech companies I’ve dealt with, either in my current role or with previous employers. They are small enough to be able to respond quickly to requests, and in more than one case, have fixed a problem, or added a small feature we needed within a matter of days. In other cases, we have contracted with them to add larger pieces of specific functionality we needed at reasonable consulting rates. Those features are added to the mainline product, and become fully supported through regular channels. The longer consulting projects have typically had a turnaround of just a few weeks.
Ajay- What challenges, if any, did you face in executing a pure open source analytics bundle?
Analyst- As Rapid-I is a smaller company based in Europe, the availability of training and consulting in the USA isn’t as extensive as for the major enterprise software players, and the time zone differences sometimes slow down the communications cycle. There were times where we were the first customer to attempt a specific integration point in our technical environment, and with no prior experiences to fall back on, we had to work with Rapid-I to figure out how to do it. Compared to what traditional software vendors provide, both R and RM tend to have sparse, terse, occasionally incomplete documentation. The situation is getting better, but still lags behind what the traditional enterprise software vendors provide.
Ajay- What are the things you can do in R, and what are the things you prefer to do in Rapid Miner (comparison for technical synergies)?
Analyst- Our experience has been that RM is superior to R at writing and maintaining structured processes, better at handling larger amounts of data, and more flexible at fine-tuning model parameters automatically. The biggest limitation we’ve had with RM compared to R is that R has a larger library of user-contributed packages for additional data mining algorithms. Sometimes we opted to use R because RM hadn’t yet implemented a specific algorithm. The introduction of the R extension has allowed us to combine the strengths of both tools in a very logical and productive way.
In particular, extending RapidMiner with R helped address RM’s weakness in the breadth of algorithms, because it brings the entire R ecosystem into RM (similar to how Rapid-I implemented much of the Weka library early on in RM’s development). Further, because the R user community releases packages that implement new techniques faster than the enterprise vendors can, this helps turn a potential weakness into a potential strength. However, R packages tend to be of varying quality, and are more prone to go stale due to lack of support/bug fixes. This depends heavily on the package’s maintainer and its prevalence of use in the R community. So when RapidMiner has a learner with a native implementation, it’s usually better to use it than the R equivalent.
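To make "optimizing learner parameters in a systematic way" a bit more concrete, here is a minimal sketch in plain R of a grid search with 5-fold cross-validation. It is not RapidMiner's own optimization operator, just the same idea written by hand. The data are simulated, and the tuned parameter (the polynomial degree of a linear model) and all variable names are assumptions chosen for illustration.

```r
# Simulated data: a quadratic signal plus noise (illustrative only)
set.seed(42)
n <- 200
x <- runif(n, -2, 2)
y <- x^2 + rnorm(n, sd = 0.3)
dat <- data.frame(x = x, y = y)

# Assign every row to one of 5 cross-validation folds
folds <- sample(rep(1:5, length.out = n))

# Mean squared prediction error for one candidate degree, averaged over folds
cv_error <- function(degree) {
  errs <- sapply(1:5, function(k) {
    train <- dat[folds != k, ]
    test  <- dat[folds == k, ]
    fit <- lm(y ~ poly(x, degree), data = train)
    mean((test$y - predict(fit, newdata = test))^2)
  })
  mean(errs)
}

# The "grid": try each candidate value and keep the one with the lowest error
degrees <- 1:5
cv_err <- sapply(degrees, cv_error)
best_degree <- degrees[which.min(cv_err)]

print(data.frame(degree = degrees, cv_error = cv_err))
print(best_degree)
```

A straight line (degree 1) cannot follow the curved signal, so the search should settle on a higher degree. The point is only that the loop over candidate values is structured and repeatable, which is what the GUI tools automate.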
RevoDeployR and commercial BI using R and R based cloud computing using Open CPU
Revolution Analytics has of course had RevoDeployR for a while, and in a webinar it is striving to bring it back to the center of the spotlight.
BI is a good, lucrative market, and visualization is a strength of R, so it is a matter of time before we have more R based BI solutions. I really liked the two slides below for explaining RevoDeployR better to newbies like me (and many others!)
Integrating R into 3rd party and Web applications using RevoDeployR
Please click here to download the PDF.
Here are some additional links that may be of interest to you:
- RevoDeployR web page: http://www.revolutionanalytics.com/products/enterprise-deployment.php
- RevoDeployR data sheet: http://www.revolutionanalytics.com/products/pdf/RevoDeployR.pdf
- RevoDeployR whitepaper: http://www.revolutionanalytics.com/why-revolution-r/whitepapers/DeployR_White_Paper.pdf
(I still think someone should make a commercial version of Jeroen Ooms’s web interfaces and Jeff Horner’s web infrastructure (see below) for making customized Business Intelligence (BI)/Data Visualization solutions; UCLA and Vanderbilt are not exactly Stanford when it comes to deploying great academic solutions in the startup-tech world). I kind of think Google or someone at Revolution should at least take a dekko at OpenCPU as a credible cloud solution in R.
I still can’t figure out whether Revolution Analytics has a cloud computing strategy, and Google seems to be working mysteriously as usual in broadening access to the Google Compute Cloud to the rest of the R community.
Open CPU provides a free and open platform for statistical computing in the cloud. It is meant as an open, social analysis environment where people can share and run R functions and objects. For more details, visit the website: www.opencpu.org
and esp see
https://public.opencpu.org/userapps/opencpu/opencpu.demo/runcode/
Jeff Horner’s
Jeroen Ooms’s
- /webapps
- /stockplot
- /lme4
- /ggplot2
- /puberty plot
- /IRT tool
Interview James G Kobielus IBM Big Data
Here is an interview with James G Kobielus, who is the Senior Program Director, Product Marketing, Big Data Analytics Solutions at IBM. Special thanks to Payal Patel Cudia of IBM’s communication team, for helping with the logistics for this.
Ajay- What are the specific parts of the IBM Platform that deal with the three layers of Big Data: variety, velocity and volume?
James- Well, first of all, let’s talk about the IBM Information Management portfolio. Our big data platform addresses the three layers of big data to varying degrees, either all together in a product, or two out of the three, or even one of the three aspects. We don’t have separate products for variety, velocity and volume.
Let us define these three layers-Volume refers to the hundreds of terabytes and petabytes of stored data inside organizations today. Velocity refers to the whole continuum from batch to real time continuous and streaming data.
Variety refers to multi-structure data from structured to unstructured files, managed and stored in a common platform analyzed through common tooling.
For Volume-IBM has a highly scalable Big Data platform. This includes Netezza and Infosphere groups of products, and Watson-like technologies that can support petabytes volume of data for analytics. But really the support of volume ranges across IBM’s Information Management portfolio both on the database side and the advanced analytics side.
For real time Velocity, we have real time data acquisition. We have a product called IBM Infosphere, part of our Big Data platform, that is specifically built for streaming real time data acquisition and delivery through complex event processing. We have a very rich range of offerings that help clients build a Hadoop environment that can scale.
Our Hadoop platform is the most real time capable of all in the industry. We are differentiated by our sheer breadth, sophistication and functional depth and tooling integrated in our Hadoop platform. We are differentiated by our streaming offering integrated into the Hadoop platform. We also offer a great range of modeling and analysis tools, pretty much more than any other offering in the Big Data space.
Attached- Jim’s slides from Hadoop World
Ajay- Any plans for Mahout for Hadoop?
Jim- I can’t speak about product plans. We have plans but I can’t tell you anything more. We do have a feature in BigInsights called SystemML, a library for machine learning.
Ajay- How integral are acquisitions for IBM in the Big Data space (Netezza, Cognos, SPSS etc.)? Is it true that everything that you have in Big Data is acquired, or is the famous IBM R and D contributing here? (See a partial list of IBM acquisitions at http://www.ibm.com/investor/strategy/acquisitions.wss )
Jim- We have developed a lot on our own. We have the deepest R and D of anybody in the industry in all things Big Data.
For example – Watson has Big Insights Hadoop at its core. Apache Hadoop is the heart and soul of Big Data (see http://www-01.ibm.com/software/data/infosphere/hadoop/ ). A great deal that makes Big Insights so differentiated is that not everything that has been built has been built by the Hadoop community.
We have built additions out of the necessity for security, modeling, monitoring, and governance capabilities into BigInsights to make it truly enterprise ready. That is one example of where we have leveraged open source and we have built our own tools and technologies and layered them on top of the open source code.
Yes of course we have done many strategic acquisitions over the last several years related to Big Data Management and we continue to do so. This quarter we have done 3 acquisitions with strong relevance to Big Data. One of them is Vivisimo (http://www-03.ibm.com/press/us/en/pressrelease/37491.wss ).
Vivisimo provides federated Big Data discovery, search and profiling capabilities to help you figure out what data is out there and what the relevance of that data is to your data science project, to help you answer the question of which data you should bring into your Hadoop cluster.
We also did Varicent, which is more performance management, and we did TeaLeaf, which is a customer experience solution provider, where customer experience management and optimization is one of the hot killer apps for Hadoop in the cloud. We have done a great many acquisitions that have a clear relevance to Big Data.
Netezza already had a massively parallel analytics database product with an embedded library of models called Netezza Analytics, and in-database capabilities to massively parallelize Map Reduce and other analytics management functions inside the database. In many ways, Netezza provided capabilities similar to those that IBM had provided for many years under the Smart Analytics Platform (http://www-01.ibm.com/software/data/infosphere/what-is-advanced-analytics/ ).
There is a differential between Netezza and ISAS.
ISAS was built predominantly in-house over several years. If you go back a decade, IBM acquired Ascential Software, a product portfolio that was the heart and soul of IBM InfoSphere Information Manager, which is core to our Big Data platform. In addition to Netezza, IBM bought SPSS two years back. We already had data mining tools and predictive modeling in the InfoSphere portfolio, but we realized we needed to have the best of breed; SPSS provided that, and so IBM acquired them.
Cognos- We had some BI reporting capabilities in the InfoSphere portfolio that we had built ourselves and also acquired to various degrees from prior acquisitions. But clearly Cognos was one of the best BI vendors, and we were lacking such a rich tool set in our product in visualization and cubing, and so for that reason we acquired Cognos.
There is also Unica, which is marketing campaign optimization, and which in many ways is a killer app for Hadoop. Projects like that are driving many enterprises.
Ajay- How would you rank order these acquisitions in terms of strategic importance rather than date of acquisition or price paid?
Jim-Think of Big Data as an ecosystem that has components that are fitted to particular functions for data analytics and data management. Is the database the core, or the modeling tool the core, or the governance tools the core, or is the hardware platform the core. Everything is critically important. We would love to hear from you what you think have been most important. Each acquisition has helped play a critical role to build the deepest and broadest solution offering in Big Data. We offer the hardware, software, professional services, the hosting service. I don’t think there is any validity to a rank order system.
Ajay- What are the initiatives regarding open source that the Big Data group has done or is planning?
Jim- What we are doing now: we are very much involved with the Apache Hadoop community. We continue to evolve the open source code that everyone leverages. We have built BigInsights on Apache Hadoop. With our BigInsights 1.4, we are the closest and most up to date in terms of version numbers to Apache Hadoop (HBase, HDFS, Pig etc.) of all commercial distributions.
We have an R library integrated with BigInsights. We have an R library integrated with Netezza Analytics. There is support for R Models within the SPSS portfolio. We already have a fair amount of support for R across the portfolio.
Ajay- What are some of the concerns (privacy, security, regulation) that you think can dampen the promise of Big Data?
Jim- There are no showstoppers, there is really a strong momentum. Some of the concerns within the Hadoop space are the immaturity of the technology, the immaturity of some of the commercial offerings out there that implement Hadoop, and the lack of standardization in a formal sense for Hadoop.
There is no Open Standards Body that declares, ratifies the latest version of Mahout, Map Reduce, HDFS etc. There is no industry consensus reference framework for layering these different sub projects. There are no open APIs. There are no certifications or interoperability standards or organizations to certify different vendors interoperability around a common API or framework.
The lack of standardization is troubling in this whole market. That creates risks for users because users are adopting multiple Hadoop products. There are lots of Hadoop deployments in the corporate world built around Apache Hadoop (purely open source). There may be no assurance that these multiple platforms will interoperate seamlessly. That’s a huge issue in terms of just magnifying the risk. And it increases the need for the end user to develop their own custom integrated code if they want to move data between platforms, or move map-reduce jobs between multiple distributions.
Also governance is a consideration. Right now Hadoop is used for high volume ETL on multi structured and unstructured data sources, or Hadoop is used for exploratory sand boxes for data scientists. These are important applications that make up a majority of the Hadoop deployments. Some Hadoop deployments are stand-alone unstructured data marts for specific applications like sentiment analysis.
Hadoop is not yet ready for data warehousing. We don’t see a lot of Hadoop being used as an alternative to data warehouses for managing the single version of truth of system of record data. That day will come, but there needs to be a broader range of mature data governance mechanisms, master data management and data profiling products out there in the marketplace that enterprises can use to make sure the data inside their Hadoop clusters is clean and is the single version of truth. That day has not arrived yet.
One of the great things about IBM’s acquisition of Vivisimo is that a piece of that overall governance picture is discovery and profiling for unstructured data, and that has been done very well by Vivisimo for several years.
What we will see is vendors such as IBM will continue to evolve security features inside of our Hadoop platform. We will beef up our data governance capabilities for this new world of Hadoop as the core of Big Data, and we will continue to build up our ability to integrate multiple databases in our Hadoop platform so that customers can use data from a bit of Hadoop, some data from a bit of traditional relational data warehouse, maybe some noSQL technology for different roles within a very complex Big Data environment.
That latter hybrid deployment model is becoming standard across many enterprises for Big Data. A cause for concern is that when your Big Data deployment has a bit of Hadoop, a bit of noSQL, a bit of EDW and a bit of in-memory, there are no open standards or frameworks for putting it all together into a unified framework, not just for interoperability but also for deployment.
There needs to be a virtualization or abstraction layer for unified access to all these different Big Data platforms by the users/developers writing the queries, by administrators so they can manage data and resources and jobs across all these disparate platforms in a seamless unified way with visual tooling. That grand scenario, the virtualization layer is not there yet in any standard way across the big data market. It will evolve, it may take 5-10 years to evolve but it will evolve.
So, that’s the concern that can dampen some of the enthusiasm for Big Data Analytics.
About-
You can read more about Jim at http://www.linkedin.com/pub/james-kobielus/6/ab2/8b0 or
follow him on Twitter at http://twitter.com/jameskobielus
You can read more about IBM Big Data at http://www-01.ibm.com/software/data/bigdata/
Facebook and R
Part 1 How do people at Facebook use R?
Itamar Rosenn, Facebook
Itamar conveyed how Facebook’s Data Team used R in 2007 to answer two questions about new users: (i) which data points predict whether a user will stay? and (ii) if they stay, which data points predict how active they’ll be after three months?
For the first question, Itamar’s team used recursive partitioning (via the rpart package) to infer that just two data points are significantly predictive of whether a user remains on Facebook: (i) having more than one session as a new user, and (ii) entering basic profile information.
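For readers who have not used recursive partitioning, here is a small sketch of what such an rpart fit can look like. The data are simulated and have nothing to do with Facebook's actual data; the column names (multi_session, has_profile, noise) and the retention probabilities are assumptions made up so that the two signals matter and the third variable does not.

```r
library(rpart)  # recommended package that ships with standard R installs

set.seed(1)
n <- 2000
# Hypothetical new-user attributes (simulated, not Facebook data)
users <- data.frame(
  multi_session = rbinom(n, 1, 0.5),  # more than one session as a new user
  has_profile   = rbinom(n, 1, 0.5),  # entered basic profile information
  noise         = rnorm(n)            # variable unrelated to retention
)

# Assumed retention probabilities: high only when both signals are present
signals <- users$multi_session + users$has_profile
p_stay <- ifelse(signals == 2, 0.95, ifelse(signals == 1, 0.30, 0.05))
users$stayed <- factor(ifelse(rbinom(n, 1, p_stay) == 1, "stay", "leave"))

# Classification tree on the simulated users
fit <- rpart(stayed ~ multi_session + has_profile + noise,
             data = users, method = "class")

# Variables the fitted tree actually splits on
vars_used <- setdiff(unique(as.character(fit$frame$var)), "<leaf>")
print(fit)
print(vars_used)
```

Printing the fit shows the split rules as nested conditions, which is a large part of why trees are popular for this kind of "which data points matter" question.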
For the second question, they fit the data to a logistic model using a least angle regression approach (via the lars package), and found that activity at three months was predicted by variables related to three classes of behavior: (i) how often a user was reached out to by others, (ii) frequency of third party application use, and (iii) what Itamar termed “receptiveness” — related to how forthcoming a user was on the site.
source-http://www.dataspora.com/2009/02/predictive-analytics-using-r/
and cute graphs like the famous
https://www.facebook.com/notes/facebook-engineering/visualizing-friendships/469716398919

and
studying baseball on facebook
https://www.facebook.com/notes/facebook-data-team/baseball-on-facebook/10150142265858859
by counting the number of posts that occurred the day after a team lost divided by the total number of wins, since losses for great teams are remarkable and since winning teams’ fans just post more.

But mostly at
https://www.facebook.com/data?sk=notes and https://www.facebook.com/data?v=app_4949752878
and creating new packages
1. jjplot (not much action here!)
https://r-forge.r-project.org/scm/viewvc.php/?root=jjplot
though
I liked the promise of JJplot at
http://pleasescoopme.com/2010/03/31/using-jjplot-to-explore-tipping-behavior/
2. ising models
https://github.com/slycoder/Rflim
https://www.facebook.com/note.php?note_id=10150359708746212
3. R pipe
https://github.com/slycoder/Rpipe
even the FB interns are cool
Part 2 How do people with R use Facebook?
Using the API at https://developers.facebook.com/tools/explorer
and code mashes from
http://romainfrancois.blog.free.fr/index.php?post/2012/01/15/Crawling-facebook-with-R
http://applyr.blogspot.in/2012/01/mining-facebook-data-most-liked-status.html
but the wonderful troubleshooting code from http://www.brocktibert.com/blog/2012/01/19/358/
which needs to be added to the code first
and using network package
> access_token <- "XXXXXXXXXXXX"
Annoyingly, the Facebook token can expire after some time; this can lead to a huge wait and NULL results with OAuth errors.
If that happens, you need to regenerate the token.
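One way to fail fast instead of staring at NULL results is to check the parsed response for an error element before using it. This is a minimal sketch, assuming the response has already been parsed into an R list (for example by fromJSON) and that errors arrive as a list element named error with type and message fields. The helper name check_graph_response and the two sample responses are made up for illustration.

```r
# resp is assumed to be the list produced by parsing the JSON reply
check_graph_response <- function(resp) {
  if (is.null(resp)) {
    stop("Empty response from the Graph API")
  }
  if (!is.null(resp$error)) {
    stop(sprintf("Graph API error (%s): %s",
                 resp$error$type, resp$error$message))
  }
  invisible(resp)
}

# Simulated responses, shaped like a successful call and an expired token
ok  <- list(data = list(list(name = "A B", id = "1")))
bad <- list(error = list(message = "Error validating access token.",
                         type = "OAuthException", code = 190))

check_graph_response(ok)  # passes silently
msg <- tryCatch(check_graph_response(bad),
                error = function(e) conditionMessage(e))
print(msg)
```

Wrapping each call in a check like this means an expired token stops the script at the first request rather than after a long loop.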
What we need
> require(RCurl)
> require(rjson)
> download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
Romain’s Famous Facebook Function (altered)
> facebook <- function( path = "me", access_token, options ){
+   if( !missing(options) ){
+     # user-supplied query parameters, followed by "&" so the token can be appended
+     options <- sprintf( "?%s&", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
+   } else {
+     # no extra parameters: the token is the first (and only) query parameter
+     options <- "?"
+   }
+   data <- getURL( sprintf( "https://graph.facebook.com/%s%saccess_token=%s", path, options, access_token ), cainfo="cacert.pem" )
+   fromJSON( data )
+ }
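The fiddly part of the function above is gluing the path, the optional parameters and the token into one URL. Here is a small stand-alone sketch of just that step, with no network access involved. The helper name build_graph_url is made up for illustration, and it additionally URL-encodes the values, which the function above does not do.

```r
# Build a Graph API style URL from a path, a token and optional parameters
build_graph_url <- function(path, access_token, options = list()) {
  # the token is always sent, after any user-supplied parameters
  params <- c(options, list(access_token = access_token))
  values <- vapply(params,
                   function(v) utils::URLencode(as.character(v), reserved = TRUE),
                   character(1))
  query <- paste(names(params), values, sep = "=", collapse = "&")
  sprintf("https://graph.facebook.com/%s?%s", path, query)
}

print(build_graph_url("me/friends", "TOKEN"))
print(build_graph_url("me/friends", "TOKEN", list(limit = 5)))
```

Keeping URL construction in its own function makes it easy to print and eyeball the request before any call is made, which helps when debugging OAuth errors.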
Now getting the friends list
> friends <- facebook( path="me/friends" , access_token=access_token)
> # extract Facebook IDs
> friends.id <- sapply(friends$data, function(x) x$id)
> # extract names
> friends.name <- sapply(friends$data, function(x) iconv(x$name, "UTF-8", "ASCII//TRANSLIT"))
> # short names to initials
> initials <- function(x) paste(substr(x,1,1), collapse="")
> friends.initial <- sapply(strsplit(friends.name, " "), initials)
This matrix can take a long time to build, so you can change the value of N to say 40 to test your network. I needed to press the escape button to cut short the plotting of all 400 friends of mine.
> # friendship relation matrix
> N <- length(friends.id)
> friendship.matrix <- matrix(0,N,N)
> for (i in 1:N) {
+ tmp <- facebook( path=paste("me/mutualfriends", friends.id[i], sep="/") , access_token=access_token)
+ mutualfriends <- sapply(tmp$data, function(x) x$id)
+ friendship.matrix[i,friends.id %in% mutualfriends] <- 1
+ }
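If you want to see what the loop above produces without waiting on the API, here is the same matrix-filling logic on a toy network of four friends. The IDs and the mutual list are invented; in the real code each entry of mutual would come from one me/mutualfriends call.

```r
# Toy friend IDs (made up)
friends.id <- c("101", "102", "103", "104")

# Hypothetical result of one mutual-friends lookup per friend
mutual <- list(
  "101" = c("102", "103"),
  "102" = c("101"),
  "103" = c("101"),
  "104" = character(0)   # no mutual friends
)

# Same fill logic as above: row i gets a 1 in every column that is a mutual friend
N <- length(friends.id)
friendship.matrix <- matrix(0, N, N, dimnames = list(friends.id, friends.id))
for (i in seq_len(N)) {
  friendship.matrix[i, friends.id %in% mutual[[friends.id[i]]]] <- 1
}

print(friendship.matrix)
print(isSymmetric(friendship.matrix))  # mutual friendship should be symmetric
print(rowSums(friendship.matrix))      # number of mutual friends per person
```

The row sums are a quick sanity check before plotting: a friend with a row sum of zero will show up as an isolated node in the network graph.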
Plotting using Network package in R (with help from the comments at http://applyr.blogspot.in/2012/01/mining-facebook-data-most-liked-status.html)
> require(network)
> net1 <- as.network(friendship.matrix)
> plot(net1, label=friends.initial, arrowhead.cex=0)
(Rgraphviz is tough if you are on Windows 7 like me)
but there is an alternative igraph solution at https://github.com/sciruela/facebookFriends/blob/master/facebook.r
After all that-..talk.. a graph..of my Facebook Network with friends initials as labels..
Opinion piece-
I hope plans to make the Facebook R package get fulfilled (just as the twitteR package led to many interesting analysis)
and also Linkedin has an API at http://developer.linkedin.com/apis
I think it would be interesting to plot professional relationships across social networks as well. But I hope to see a LinkedIn package (or blog code) soon.
As for jjplot, I had hoped ggplot and jjplot would merge, or at least have some kind of inclusion in the Deducer GUI. Maybe a Google Summer of Code project if people are busy!!
Also the geeks at Facebook.com can think of giving something back to the R community, as Google generously does with funding packages like RUnit, Deducer and Summer of Code, besides sponsoring meet ups etc.
(note – this is part of the research for the upcoming book ” R for Business Analytics”)
ps-
but I didn’t get time to download all my posts using R code at
https://gist.github.com/1634662#
or do specific Facebook Page analysis using R at
Updated-
# access token from https://developers.facebook.com/tools/explorer
access_token="AAuFgaOcVaUZAssCvL9dPbZCjghTEwwhNxZAwpLdZCbw6xw7gARYoWnPHxihO1DcJgSSahd67LgZDZD"
require(RCurl)
require(rjson)
# download the file needed for authentication, see http://www.brocktibert.com/blog/2012/01/19/358/
download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile="cacert.pem")
# http://romainfrancois.blog.free.fr/index.php?post/2012/01/15/Crawling-facebook-with-R
facebook <- function( path = "me", access_token, options ){
  if( !missing(options) ){
    # user-supplied query parameters, followed by "&" so the token can be appended
    options <- sprintf( "?%s&", paste( names(options), "=", unlist(options), collapse = "&", sep = "" ) )
  } else {
    # no extra parameters: the token is the first (and only) query parameter
    options <- "?"
  }
  data <- getURL( sprintf( "https://graph.facebook.com/%s%saccess_token=%s", path, options, access_token ), cainfo="cacert.pem" )
  fromJSON( data )
}
# see http://applyr.blogspot.in/2012/01/mining-facebook-data-most-liked-status.html
# scrape the list of friends
friends <- facebook( path="me/friends" , access_token=access_token)
# extract Facebook IDs
friends.id <- sapply(friends$data, function(x) x$id)
# extract names
friends.name <- sapply(friends$data, function(x) iconv(x$name,"UTF-8","ASCII//TRANSLIT"))
# short names to initials
initials <- function(x) paste(substr(x,1,1), collapse="")
friends.initial <- sapply(strsplit(friends.name," "), initials)
# friendship relation matrix
# N <- length(friends.id)
N <- min(200, length(friends.id))  # cap the number of friends to keep the run short
ids <- friends.id[seq_len(N)]      # only the first N friends are used below
friendship.matrix <- matrix(0,N,N)
for (i in seq_len(N)) {
  tmp <- facebook( path=paste("me/mutualfriends", ids[i], sep="/") , access_token=access_token)
  mutualfriends <- sapply(tmp$data, function(x) x$id)
  friendship.matrix[i, ids %in% mutualfriends] <- 1
}
require(network)
net1 <- as.network(friendship.matrix)
plot(net1, label=friends.initial[seq_len(N)], arrowhead.cex=0)
Radoop 0.3 launched- Open Source Graphical Analytics meets Big Data
What is Radoop? Quite possibly an exciting mix of analytics and big data computing
What is Radoop?
Hadoop is an excellent tool for analyzing large data sets, but it lacks an easy-to-use graphical interface. RapidMiner is an excellent tool for data analytics, but its data size is limited by the memory available, and a single machine is often not enough to run the analyses on time. In this project, we combine the strengths of both projects and provide a RapidMiner extension for editing and running ETL, data analytics and machine learning processes over Hadoop.
We have closely integrated the highly optimized data analytics capabilities of Hive and Mahout, and the user-friendly interface of RapidMiner to form a powerful and easy-to-use data analytics solution for Hadoop.
and what’s new
Radoop 0.3 released – fully graphical big data analytics
Today, Radoop had a major step forward with its 0.3 release. The new version of the visual big data analytics package adds full support for all major Hadoop distributions used these days: Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution including Apache Hadoop 3 (CDH3). It also adds support for large clusters by allowing the namenode, the jobtracker and the Hive server to reside on different nodes.
As Radoop’s promise is to make big data analytics easier, the 0.3 release is also focused on improving the user interface. It has an enhanced breakpointing system which allows you to investigate intermediate results, and it adds dozens of quick fixes, so common process design mistakes get much easier to solve.
There are many further improvements and fixes, so please consult the release notes for more details. Radoop is in private beta mode, but heading towards a public release in Q2 2012. If you would like to get early access, then please apply at the signup page or describe your use case in email (beta at radoop.eu).
Radoop 0.3 (15 February 2012)
- Support for Apache Hadoop 0.20.2, 0.20.203, 1.0 and Cloudera’s Distribution Including Apache Hadoop 3 (CDH3) in a single release
- Support for clusters with separate master nodes (namenode, jobtracker, Hive server)
- Enhanced breakpointing to evaluate intermediate results
- Dozens of quick fixes for the most common process design errors
- Improved process design and error reporting
- New welcome perspective to help in the first steps
- Many bugfixes and performance improvements
Radoop 0.2.2 (6 December 2011)
- More Aggregate functions and distinct option
- Generate ID operator for convenience
- Numerous bug fixes and improvements
- Improved user interface
Radoop 0.2.1 (16 September 2011)
- Set Role and Data Multiplier operators
- Management panel for testing Hadoop connections
- Stability improvements for Hive access
- Further small bugfixes and improvements
Radoop 0.2 (26 July 2011)
- Three new algorithms: Fuzzy K-Means, Canopy, and Dirichlet clustering
- Three new data preprocessing operators: Normalize, Replace, and Replace Missing Values
- Significant speed improvements in data transmission and interactive analytics
- Increased stability and speedup for K-Means
- More flexible settings for Join operations
- More meaningful error messages
- Other small bugfixes and improvements
Radoop 0.1 (14 June 2011)
Initial release with 26 operators for data transmission, data preprocessing, and one clustering algorithm.
Note that Rapid Miner also has a great R extension, so you can combine R, a graphical interface and big data analytics, which is now easier and more powerful than ever.
Cyber Cold War
I try to write on cyber conflict without getting into the politics of why someone is hacking someone else. I always get beaten by someone in the comments thread when I write on politics.
But recent events have forced me to update my usual “how-to” cyber conflict to “why” cyber conflict. This is because of a terrorist attack in my hometown Delhi.
Updated-
Iran allegedly tried (as per Israel) to assassinate the wife of the Israeli Defence Attache in Delhi, India, using a magnetic bomb as she went to school to pick up her kids. Somebody else put a grenade in an Israeli embassy car in Georgia, which was found in time.
Based on reports, initial work suggests the bomb was much more sophisticated than what local terrorists use, but the terrorists seemed to have had some local recce work done.
India has 0 history of antisemitism but this is the second time Israelis have been targeted since 26/11 Mumbai attacks. India buys 12 % of oil annually from Iran (and refuses to join the oil embargo called by US and Europe)
Cyber Conflict is less painful than conflict, which is inevitable as long as mankind exists. Also the Western hemisphere needs a moon shot (cyber conflict could be the Sputnik like moment) and with declining and aging populations but better technology, Western Hemisphere govts need cyber conflict as they are running out of humans to fight their wars. Eastern govt. are even more obnoxious in using children for conflict propaganda, and corruption.
Last week CIA.gov website went down
This week Iranian govt is allegedly blocking https traffic on eve of Annual Revolution Day (what a coincidence!)
Some resources to help Internet users in Iran (or maybe this could be a dummy test for the big one – hacking the great firewall of China)
News from Hacker News-
http://news.ycombinator.com/item?id=3575029
I’m writing this to report the serious troubles we have regarding accessing Internet in Iran at the moment. Since Thursday Iranian government has shut down the https protocol, which has caused almost all google services (gmail, and google.com itself) to become inaccessible. Almost all websites that rely on Google APIs (like wolfram alpha) won’t work. Accessing any website that relies on https (just imagine how many websites use this protocol, from Arch Wiki to bank websites) is impossible. Accessing many proxies is also impossible. There are almost no official reports on this and with many websites and my email accounts restricted I can just confirm this based on my own and friends’ experience. I have just found one report here:
http://kabirnews.com/iran-shut-down-gmail-google-yahoo-and-sites-using-https-protocol/202/
The reason for this horrible shutdown is that the Iranian regime celebrates 1979 Islamic revolution tomorrow.
I just wanted to let you guys know about this. If you have any solution regarding bypassing this restriction please help!
The boys at Tor think they can help-
but it’s not so elegant, as I prefer creating a batch file rather than explaining coding to newbies.
this is still getting to better and easier interfaces
https://www.torproject.org/projects/obfsproxy-instructions.html.en
Obfsproxy Instructions

Step 1: Install dependencies, obfsproxy, and Tor
You will need a C compiler (gcc), the autoconf and autotools build system, the git revision control system, pkg-config and libtool, libevent-2 and its headers, and the development headers of OpenSSL.
On Debian testing or Ubuntu oneiric, you could do:
# apt-get install autoconf autotools-dev gcc git pkg-config libtool libevent-2.0-5 libevent-dev libevent-openssl-2.0-5 libssl-dev
If you’re on a more stable Linux, you can either try our experimental backport libevent2 debs or build libevent2 from source.
Clone obfsproxy from its git repository:
$ git clone https://git.torproject.org/obfsproxy.git
The above command should create and populate a directory named ‘obfsproxy’ in your current directory.
Compile obfsproxy:
$ cd obfsproxy
$ ./autogen.sh && ./configure && make
Optionally, as root install obfsproxy in your system:
# make install
If you prefer not to install obfsproxy as root, you can instead just modify the Transport lines in your torrc file (explained below) to point to your obfsproxy binary.
You will need Tor 0.2.3.11-alpha or later.
Step 2a: If you’re the client…
First, you need to learn the address of a bridge that supports obfsproxy. If you don’t know any, try asking a friend to set one up for you. Then add the appropriate lines to your tor configuration file:
UseBridges 1
Bridge obfs2 128.31.0.34:1051
ClientTransportPlugin obfs2 exec /usr/local/bin/obfsproxy --managed
Don’t forget to replace 128.31.0.34:1051 with the IP address and port that the bridge’s obfsproxy is listening on.
Congratulations! Your traffic should now be obfuscated by obfsproxy. You are done! You can now start using Tor.
For old fashioned tunnel creation under Seas of English Channel-
http://dag.wieers.com/howto/ssh-http-tunneling/
- You can proxy to anywhere (see the Proxy directive in Apache) based on names
- You can proxy to any port you like (see the AllowCONNECT directive in Apache)
- It works even when there is a layer-7 protocol firewall
- If you enable proxytunnel ssl support, it is indistinguishable from real SSL traffic
- You can come up with nice hostnames like ‘downloads.yourdomain.com’ and ‘pictures.yourdomain.com’ and for normal users these will look like normal websites when visited.
- There are many possibilities for doing authentication further along the path
- You can do proxy-bouncing to the n-th degree to mask where you’re coming from or going to (however this requires more changes to proxytunnel, currently I only added support for one remote proxy)
- You do not have to dedicate an IP-address for sshd, you can still run an HTTPS site
Related-
http://opensourceandhackystuff.blogspot.in/2012/02/captive-portal-security-part-1.html
and some crypto for young people
http://users.telenet.be/d.rijmenants/en/onetimepad.htm
Me- What am I doing about it? I am just writing poems on hacking at http://poemsforkush.com












