Why the West needs China to start moving towards cyber conflict

Hypothesis-

Western countries are running out of people to fight their wars. This is all the more acute given current demographic trends in both armed forces and general populations.

A shift to cyber conflict can help the West maintain parity against Eastern methods of asymmetrical warfare, moving the contest from human attrition to cyberspace.

Declining resources will lead to converging conflicts of interest and shifting balance-of-power dynamics in the 21st century.

Assumed Facts

The launch of Sputnik by the USSR led to the moon-shot race by the US (1960s).

The announcement of the proposed Star Wars programme (the Strategic Defense Initiative) by the USA led to unsustainable defence expenditure by the USSR (1980s).

The threat of cyber conflict and espionage by China (and Russian cyber actions in the war with Georgia) has led to increasing budgets for cyber conflict research and defense in the USA (2010s).

Assumptions

If we do not learn from history, we are condemned to repeat it.

Declining populations in the West and rising populations in the East in the 21st century. The difference in military-age personnel will be even more severe, due to more rapid aging in the West.

Economic output will be proportional to the number of people employed as economies reach similar stages of maturity (Factor-Manufacturing-Services-Innovation).

Data-

http://esa.un.org/unpd/wpp/unpp/panel_population.htm

http://www.census.gov/population/international/

GDP projections to 2050 (graphic)

Summary-

Western defence forces will not be able to afford a human-attrition-intensive war by 2030, given current demographic trends (both slowing growth and aging). The existing balance of power could be maintained if resources are shared or if warfare moves to cyberspace. Technological advances can help augment resources, weakening the case for conflict scenarios.

Will the Internet be used by the US against China in the 21st century as opium was used by Great Britain in the 19th? Time will tell 🙂

How to use BitTorrent

I really liked the software qBittorrent, available from http://www.qbittorrent.org/. I think BitTorrent should be the default way of sharing huge content, especially software downloads. For protecting intellectual property, there should be much better codes and software keys than are presently available.

The qBittorrent project aims to provide a Free Software alternative to µtorrent. Additionally, qBittorrent runs and provides the same features on all major platforms (Linux, Mac OS X, Windows, OS/2, FreeBSD).

qBittorrent is based on the Qt4 toolkit and libtorrent-rasterbar.

qBittorrent v2 Features

  • Polished µTorrent-like User Interface
  • Well-integrated and extensible Search Engine
    • Simultaneous search in most famous BitTorrent search sites
    • Per-category-specific search requests (e.g. Books, Music, Movies)
  • All Bittorrent extensions
    • DHT, Peer Exchange, Full encryption, Magnet/BitComet URIs, …
  • Remote control through a Web user interface
    • Nearly identical to the regular UI, all in Ajax
  • Advanced control over trackers, peers and torrents
    • Torrents queueing and prioritizing
    • Torrent content selection and prioritizing
  • UPnP / NAT-PMP port forwarding support
  • Available in ~25 languages (Unicode support)
  • Torrent creation tool
  • Advanced RSS support with download filters (inc. regex)
  • Bandwidth scheduler
  • IP Filtering (eMule and PeerGuardian compatible)
  • IPv6 compliant
  • Sequential downloading (aka “Download in order”)
  • Available on most platforms: Linux, Mac OS X, Windows, OS/2, FreeBSD
So if you are new to BitTorrent, here is a brief tutorial.
Some terminology:

Tracker

A tracker is a server that keeps track of which seeds and peers are in the swarm.

Seed

A seed is a peer who has 100% of the data. When a leech obtains 100% of the data, that peer automatically becomes a seed.

Peer

A peer is one instance of a BitTorrent client running on a computer on the Internet to which other clients connect and transfer data.

Leech

Leech is a term with two meanings. Primarily, it refers to a peer (or peers) with a very poor share ratio, downloading much more than they upload (a ratio of less than 1.0), which has a negative effect on the swarm. More neutrally, it can simply mean any peer that is still downloading.
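
Since a leech is defined by its share ratio, here is a tiny R sketch of that arithmetic (the byte counts are made-up values, not taken from any client's API):

```r
# Share ratio = total uploaded / total downloaded (toy numbers).
share_ratio <- function(uploaded_bytes, downloaded_bytes) {
  uploaded_bytes / downloaded_bytes
}

share_ratio(uploaded_bytes = 2e9, downloaded_bytes = 8e9)  # 0.25 < 1.0: a leech
```
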
1) Download and install the software from http://www.qbittorrent.org/
2) If you want to search for new files, you can use the nice built-in search features.
3) If you want to CREATE new torrents, go to Tools > Torrent Creator.
4) For sharing content, just seed the torrent you created. What is seeding? Hey, did you read the terminology at the beginning?
5) Additionally –

Trackers: Below are some popular public trackers. They are servers which help peers to communicate.

Here are some good trackers you can use:

http://open.tracker.thepiratebay.org/announce
http://www.torrent-downloads.to:2710/announce
http://denis.stalker.h3q.com:6969/announce
udp://denis.stalker.h3q.com:6969/announce
http://www.sumotracker.com/announce

and

Super-seeding

When a file is new, much time can be wasted because the seeding client might send the same file piece to many different peers, while other pieces have not yet been downloaded at all. Some clients, like ABC, Vuze, BitTornado, TorrentStorm, and µTorrent, have a “super-seed” mode, where they try to only send out pieces that have never been sent out before, theoretically making the initial propagation of the file much faster. However, super-seeding becomes less effective and may even reduce performance compared to the normal “rarest first” model in cases where some peers have poor or limited connectivity. This mode is generally used only for a new torrent, or one which must be re-seeded because no other seeds are available.
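
The “rarest first” strategy mentioned above is simple to state: among the pieces you still lack, request the one the fewest peers have. A minimal sketch in R (the availability numbers are invented for illustration):

```r
# Rarest-first piece selection (a simplified model of the real protocol).
# have[i]: TRUE if our client already has piece i.
# availability[i]: how many peers in the swarm offer piece i.
rarest_first <- function(have, availability) {
  wanted <- which(!have)                       # pieces we still need
  if (length(wanted) == 0) return(NA_integer_) # nothing left: we are a seed
  wanted[which.min(availability[wanted])]      # request the rarest piece
}

have <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
availability <- c(9, 4, 1, 7, 6)
rarest_first(have, availability)  # returns 3, the piece only one peer offers
```

Super-seeding changes this logic on the seeding side only: the seed advertises each piece as if it were rare, so that no piece is uploaded a second time before every piece has been sent out once.
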
Note- you use this tutorial and any or all steps at your own risk. I am not legally responsible for any mishaps you get into. Please be responsible while being an efficient BitTorrent user. That means respecting intellectual property rights.

Interview Michal Kosinski, Concerto Web-Based App using #Rstats

Here is an interview with Michal Kosinski, leader of the team that has created Concerto – a web-based application using R. What is Concerto? As per http://www.psychometrics.cam.ac.uk/page/300/concerto-testing-platform.htm

Concerto is a web-based, adaptive testing platform for creating and running rich, dynamic tests. It combines the flexibility of HTML presentation with the computing power of the R language and the safety and performance of the MySQL database. It’s totally free for commercial and academic use, and it’s open source.

Ajay- Describe your career in science from high school to this point. What are the various stats platforms you have trained on, and what do you think are their comparative advantages and disadvantages?

Michal- I started with maths, but quickly realized that I prefer social sciences; thus after one year I switched to a psychology major and obtained my MSc in Social Psychology with a specialization in Consumer Behaviour. At that time I was mostly using SPSS, as it was the only statistical package taught to students in my department. Also, it was not too bad for small samples and the rather basic analyses I was performing at that time.

My more recent research, performed during my MPhil course in Psychometrics at Cambridge University followed by my current PhD project in social networks and research work at Microsoft Research, requires significantly more powerful tools. Initially, I tried to squeeze as much as possible from SPSS/PASW by mastering the syntax language. SPSS was all I knew, though I reached its limits pretty quickly and was forced to switch to R. It was a pretty dreary experience at the start, switching from an unwieldy but familiar environment into an unwelcoming command line interface, but I quickly realized how empowering and convenient this tool was.

I believe that a course in R should be obligatory for all students who are likely to come close to any data analysis in their careers. It is really empowering – once you get the basics, you have the potential to use virtually any method there is, and to automate most tasks related to analysing and processing data. It is also free and open source, so you can use it wherever you work. Finally, it enables you to quickly and seamlessly migrate to other powerful environments such as Matlab, C, or Python.

Ajay- What was the motivation behind building Concerto?

Michal- We deal with a lot of online projects at the Psychometrics Centre – one of them attracted more than 7 million unique participants. We needed a powerful tool that would allow researchers and practitioners to conveniently build and deliver online tests.

Also, our relationships with the website designers and software engineers that worked on developing our tests were rather difficult. We had trouble successfully explaining our needs, each little change was implemented with a delay and at significant cost. Not to mention the difficulties with embedding some more advanced methods (such as adaptive testing) in our tests.

So we created a tool allowing us, psychometricians, to easily develop psychometric tests from scratch and publish them online. And all this without having to hire software developers.

Ajay- Why did you choose R as the back end for Concerto? What other languages and platforms did you consider? Apart from Concerto, how else do you utilize R in your center, department, and university?

Michal- R was a natural choice as it is open-source, free, and nicely integrates with a server environment. Also, we believe that it is becoming a universal statistical and data processing language in science. We put increasing emphasis on teaching R to our students and we hope that it will replace SPSS/PASW as a default statistical tool for social scientists.

Ajay- What all can Concerto do besides a computer adaptive test?

Michal- We did not plan it initially, but Concerto turned out to be extremely flexible. In a nutshell, it is a web interface to an R engine with a built-in MySQL database and an easy-to-use developer panel. It can be installed on both Windows and Unix systems and used over the network or locally.

Effectively, it can be used to build any kind of web application that requires a powerful and quickly deployable statistical engine. For instance, I envision an easy-to-use website (that could look a bit like SPSS) allowing students to analyse their data using a web browser alone (learning the underlying R code simultaneously). Also, the authors of R libraries (or anyone else) could use Concerto to build user-friendly web interfaces to their methods.

Finally, Concerto can be conveniently used to build simple non-adaptive tests and questionnaires. It might seem slightly less intuitive at first than popular questionnaire services (such as my favourite Survey Monkey), but it has virtually unlimited flexibility when it comes to item format, test flow, feedback options, etc. Also, it’s free.
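
To give a flavour of the kind of psychometric logic such an R engine can run, here is a minimal, hypothetical sketch of one adaptive-test step in base R, using the standard two-parameter logistic (2PL) item response model. This is not Concerto's actual code or API; the item parameters are made up for illustration:

```r
# 2PL item response model: probability of a correct answer given
# ability theta, item discrimination a and item difficulty b.
p_correct <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

# Adaptive step: pick the unanswered item with the highest Fisher
# information at the current ability estimate (info = a^2 * p * (1 - p)).
next_item <- function(theta, a, b, answered) {
  p <- p_correct(theta, a, b)
  info <- a^2 * p * (1 - p)
  info[answered] <- -Inf      # never re-ask an already-answered item
  which.max(info)
}

# Made-up item bank of three items.
a <- c(1.2, 0.8, 1.5)   # discriminations
b <- c(-1.0, 0.0, 1.0)  # difficulties
next_item(theta = 0.2, a = a, b = b, answered = c(FALSE, TRUE, FALSE))
```
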

Ajay- How do you see the cloud computing paradigm growing? Do you think browser based computation is here to stay?

Michal- I believe that cloud infrastructure is the future. Dynamically sharing computational and network resources between online service providers has a great competitive advantage over traditional strategies for dealing with network infrastructure. I am sure the security concerns will be resolved soon, finishing the transformation of the network infrastructure as we know it. On the other hand, however, I do not see a reason why client-side (or browser) processing of information should cease to exist – I rather think that the border between the cloud and the personal or local computer will continue to dissolve.

About

Michal Kosinski is Director of Operations for The Psychometrics Centre and Leader of the e-Psychometrics Unit. He is also a research advisor to the Online Services and Advertising group at Microsoft Research Cambridge, and a visiting lecturer at the Department of Mathematics at the University of Namur, Belgium. You can read more about him at http://www.michalkosinski.com/

You can read more about Concerto at http://code.google.com/p/concerto-platform/ and http://www.psychometrics.cam.ac.uk/page/300/concerto-testing-platform.htm

Predictive Models Ain’t Easy to Deploy

This is a guest blog post by Carole-Ann Matignon of Sparkling Logic. You can see more on Sparkling Logic at http://my.sparklinglogic.com/

Decision Management is about combining predictive models and business rules to automate decisions for your business. Insurance underwriting, loan origination or workout, and claims processing are all very good use cases for that discipline… But there is a hiccup… It ain’t as easy as you would expect…

What’s easy?

If you have a neat model, then most tools will allow you to export it as a PMML model – PMML stands for Predictive Model Markup Language and is a standard XML representation for predictive model formulas. Many model development tools let you export it without much effort. Many BRMS – Business Rules Management Systems – let you import it. Tada… The model is ready for deployment.
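
As a concrete illustration of the easy part, here is a hedged sketch in R, assuming the CRAN pmml package (one of the many tools that can emit PMML); the toy logistic regression stands in for a real scoring model:

```r
# Fit a toy logistic regression (a stand-in for a real predictive model).
fit <- glm(vs ~ mpg + wt, data = mtcars, family = binomial)

# Convert the fitted model to its PMML (XML) representation and save it.
library(pmml)
model_xml <- pmml(fit)
XML::saveXML(model_xml, file = "toy_model.pmml")
```

The resulting .pmml file is what a BRMS would import on the other side.
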

What’s hard?

The problem that we keep seeing over and over in the industry is the issue around variables.

Those neat predictive models are formulas based on variables that may or may not exist as-is in your object model. When the variable is itself a formula based on the object model – like the min, max, or sum of the dollar amount spent on groceries in the past 3 months – and the object model comes with transaction details, such that you can compute it by iterating through those transactions, then the problem is not “that” big. PMML 4 introduced some support for those variables.
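
For example, deriving such a variable from raw transactions is only a few lines of R (toy data and a made-up as-of date, just to illustrate the iteration the paragraph describes):

```r
# Toy transaction-level data, as an operational object model might supply.
tx <- data.frame(
  customer = c("A", "A", "B", "B"),
  category = c("Groceries", "Fuel", "Groceries", "Groceries"),
  amount   = c(120.5, 40.0, 80.0, 55.2),
  date     = as.Date(c("2012-01-05", "2012-01-20", "2011-12-15", "2012-02-01"))
)

# Derived model variable: sum of grocery spend in the past 3 months.
asof   <- as.Date("2012-02-15")   # hypothetical scoring date
recent <- subset(tx, category == "Groceries" & date >= asof - 90)
aggregate(amount ~ customer, data = recent, FUN = sum)
```
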

The issue that is not easy to fix, and yet quite frequent, is when the model development data model does not resemble the operational one. Your Data Warehouse very likely flattened the object model, and pre-computed some aggregations that make the mapping very hard to restore.

It is clearly not an impossible project, as many organizations do it today. It comes with a significant overhead, though, that forces modelers to involve IT resources to extract the right data for the model to be operationalized. It is a heavy process that is well justified for heavy-duty models developed over a period of time, with a significant ROI.

This is a show-stopper though for other initiatives which do not have the same ROI, or would require too frequent model refresh to be viable. Here, I refer to “real” model refresh that involves a model reengineering, not just a re-weighting of the same variables.

For those initiatives where time is of the essence, the challenge will be to bring closer those two worlds, the modelers and the business rules experts, in order to streamline the development AND deployment of analytics beyond the model formula. The great opportunity I see is the potential for a better and coordinated tuning of the cut-off rules in the context of the model refinement. In other words: the opportunity to refine the strategy as a whole. Very ambitious? I don’t think so.

About Carole Ann Matignon

http://my.sparklinglogic.com/index.php/company/management-team


Carole-Ann Matignon – Co-Founder, President & Chief Executive Officer

She is a renowned guru in the Decision Management space. She created the vision for Decision Management that is widely adopted now in the industry. Her claim to fame is managing the strategy and direction of Blaze Advisor, the leading BRMS product, while she also managed all the Decision Management tools at FICO (business rules, predictive analytics, and optimization). She has a vision for Decision Management, both as a technology and a discipline, that can revolutionize the way corporations do business, and will never get tired of painting that vision for her audience. She speaks often at industry conferences and has conducted university classes in France and Washington DC.

She started her career building advanced systems using all kinds of technologies – expert systems, rules, optimization, dashboarding and cubes, web search, and beta versions of database replication. At Cleversys (acquired by Kurt Salmon & Associates), she also conducted strategic consulting gigs around change management.

While playing with advanced software components, she found a passion for technology and joined ILOG (acquired by IBM). She developed a growing interest in Optimization as well as Business Rules. At ILOG, she coined the term BRMS while brainstorming with her Sales counterpart. She led the Presales organization for Telecom in the Americas up until 2000 when she joined Blaze Software (acquired by Brokat Technologies, HNC Software and finally FICO).

Her 360-degree experience allowed her to gain appreciation for all aspects of a software company, giving her a unique perspective on the business. Her technical background kept her very much in touch with technology as she advanced.

On a Hiatus

No blogging (except for interviews)

No poetry (unless I get really inspired and my scrapbook fills up)

No random internet browsing (except search for research)

Hell no, Facebook

No TV

No Movies

No goofing off

No wasting time using creative juice stewing as an excuse

Write the book

Write the book

Write the book, dammit

Stanford Courses Delayed Again

Message from the guys at Palo Alto – Why don’t they just make videos using Khan Academy’s help?

We’re sorry to have to tell you that our Machine Learning course will be delayed further. There have naturally been legal and administrative issues to be sorted out in offering Stanford classes freely to the outside world, and it’s just been taking time. We have, however, been able to take advantage of the extra time to debug and improve our course content!

We now expect that the course will start either late in February or early in March. We will let you know as soon as we hear a definite date. We apologize for the lack of communication in recent weeks; we kept hoping we would have a concrete launch date to give you, but that date has kept slipping.

Thanks so much for your patience! We are really sorry for repeatedly making you wait, and for any interference this causes in your schedules. We’re as excited and anxious as you are to get started, and we both look forward to your joining us soon in Machine Learning!

Andrew Ng and the ML Course Staff

Oracle launches its version of R #rstats

From-

http://www.oracle.com/us/corporate/press/1515738

Integrates R Statistical Programming Language into Oracle Database 11g

News Facts

  • Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
  • Oracle R Enterprise delivers enterprise-class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
  • R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently and scale against data stored in Oracle Database 11g.
  • Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to a 100x increase in performance compared to their current environment.
  • Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that deliver new insights into business performance.
  • Oracle Advanced Analytics, in conjunction with Oracle Big Data Appliance, Oracle Exadata Database Machine, and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

  • Oracle Advanced Analytics brings analytic algorithms to the data stored in Oracle Database 11g and Oracle Exadata, as opposed to the traditional approach of extracting data to laptops or specialized servers (see the sketch after this list).
  • With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
  • By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
  • Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion-row data sets in a matter of seconds, making iterative, speed-of-thought, high-quality numerical analysis on Big Data practical.
  • Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate them into other languages before they can be deployed in production.
  • Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
  • Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
  • The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
  • Oracle provides single-vendor support for the entire Big Data platform, spanning the hardware stack, operating system, open source R, Oracle R Enterprise, and Oracle Database 11g.
  • To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.
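
A hedged, generic sketch of that in-database idea from R, using DBI/ROracle rather than the actual Oracle R Enterprise API (the connection details, table, and column names are placeholders):

```r
library(DBI)

# Placeholder credentials; a real deployment would use its own connect string.
con <- dbConnect(ROracle::Oracle(), username = "scott", password = "tiger",
                 dbname = "mydb")

# Traditional approach: extract all rows to the client, then aggregate in R.
all_rows <- dbGetQuery(con, "SELECT region, revenue FROM sales")
aggregate(revenue ~ region, data = all_rows, FUN = sum)

# In-database approach: the database does the work; only the small result moves.
dbGetQuery(con, "SELECT region, SUM(revenue) AS revenue FROM sales GROUP BY region")

dbDisconnect(con)
```
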

Supporting Quotes

“Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
“We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.
Oracle Advanced Analytics — an option to Oracle Database 11g Enterprise Edition – extends the database into a comprehensive advanced analytics platform through two major components: Oracle R Enterprise and Oracle Data Mining. With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.

Oracle R Enterprise tightly integrates the open source R programming language with the database to further extend the database with R’s library of statistical functionality, and pushes down computations to the database. Oracle R Enterprise dramatically advances the capability for R users, allowing them to use their existing R development skills and tools; scripts can now also run transparently and scale against data stored in Oracle Database 11g.

Oracle Data Mining provides powerful data mining algorithms that run as native SQL functions for in-database model building and model deployment. It can be accessed through Oracle Data Miner, a SQL Developer extension, to build, evaluate, share, and deploy predictive analytics methodologies. At the same time, the high-performance Oracle-specific data mining algorithms are accessible from R.
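
Since those in-database models score through plain SQL, calling them from R reduces to a query. A hedged sketch (the model name, table, and connection are hypothetical; PREDICTION and PREDICTION_PROBABILITY are Oracle's SQL scoring functions):

```r
library(DBI)
con <- dbConnect(ROracle::Oracle(), username = "scott", password = "tiger",
                 dbname = "mydb")

# Score every row with a hypothetical in-database model "churn_model".
scores <- dbGetQuery(con, "
  SELECT customer_id,
         PREDICTION(churn_model USING *)             AS predicted_class,
         PREDICTION_PROBABILITY(churn_model USING *) AS churn_probability
  FROM customers")

head(scores)
dbDisconnect(con)
```
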

BENEFITS

  • Scalability—Allows customers to easily scale analytics as data volume increases by bringing the algorithms to where the data resides – in the database
  • Performance—With analytical operations performed in the database, R users can take advantage of the extreme performance of Oracle Exadata
  • Security—Provides data analysts with direct but controlled access to data in Oracle Database 11g, accelerating data analyst productivity while maintaining data security
  • Save Time and Money—Lowers overall TCO for data analysis by eliminating data movement and shortening the time it takes to transform “raw data” into “actionable information”
The Oracle R Hadoop Connector gives R users high-performance native access to the Hadoop Distributed File System (HDFS) and the MapReduce programming framework. It is an R package.
From the datasheet at