In 2009, I saw the post "Becoming really rich with C#", which showcased the new features in C# 4.0, and I was impressed by C#'s hybrid object-functional approach and by the collection APIs that give collection operations a SQL-like feel:
    var adjustedPrices =
        e.Result
         .Split(new[] { '\n' })
         .Skip(1)
         .Select(l => l.Split(new[] { ',' }))
         .Where(l => l.Length == 7)
         .Select(v => new Event(DateTime.Parse(v[0]), Double.Parse(v[6])));
Now let's do that in 7 lines of code in Java 5, 6, or 7. Er, no, sorry.
At the time, I was learning Scala, so I translated "Becoming really rich with C#" into Scala and compared the two side by side. See http://quoiquilensoit.blogspot.com/2009/10/becoming-really-rich-with-scala.html The result surprised me: I thought C# held up pretty well overall.
So, a full four years later, Oracle owns Java and Java 8 is out with some of the same features that C# was offering in .NET 4.0 in 2010. There is obvious stuff that Java 8 still does not have: LINQ, output parameters, vars, tuples, and optional/nullable numerics. But I tried the same exercise, trying to keep to the spirit of the C# code.
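For comparison, here is a rough sketch of the same pipeline written with pre-Java-8 loops. The `Event` class, the sample CSV text, and the `PreJava8Parsing` name are stand-ins made up for this illustration; they are not taken from the original post or from the repository.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical stand-in for the Event type used in the C# snippet.
class Event {
    final Date date;
    final double price;

    Event(Date date, double price) {
        this.date = date;
        this.price = price;
    }
}

class PreJava8Parsing {
    // Skips the header row, keeps rows with exactly 7 columns, and builds
    // an Event from column 0 (date) and column 6 (adjusted close).
    static List<Event> parse(String csv) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd");
        List<Event> events = new ArrayList<Event>();
        String[] lines = csv.split("\n");
        for (int i = 1; i < lines.length; i++) {
            String[] v = lines[i].split(",");
            if (v.length != 7) {
                continue;
            }
            try {
                events.add(new Event(format.parse(v[0]), Double.parseDouble(v[6])));
            } catch (ParseException e) {
                // Wrap the checked exception so callers need no throws clause.
                throw new IllegalArgumentException("Bad date: " + v[0], e);
            }
        }
        return events;
    }

    public static void main(String[] args) {
        // Made-up sample data in the same column layout as the price feed.
        String csv = "Date,Open,High,Low,Close,Volume,Adj Close\n"
                + "2014-03-03,10.0,11.0,9.5,10.5,1000,10.4\n"
                + "incomplete,row\n"
                + "2014-03-04,10.5,11.5,10.0,11.0,1200,10.9\n";
        System.out.println(parse(csv).size() + " events parsed"); // prints "2 events parsed"
    }
}
```

The logic is the same as the C# pipeline, but the skip, filter, and map steps are buried in loop bookkeeping rather than named one per line.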
The code is on GitHub: https://github.com/azzoti/get-rich-with-java8

    git clone https://github.com/azzoti/get-rich-with-java8.git

It's an Eclipse Maven project, but you can run it straight from the command line with:

    mvn exec:java

(Make sure you have JDK 8 set up!)
Original C#:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Net;
    using System.Threading;
    using System.Threading.Tasks;
    using System.IO;

    namespace ETFAnalyzer {

        struct Event {
            internal Event(DateTime date, double price) { Date = date; Price = price; }
            internal readonly DateTime Date;
            internal readonly double Price;
        }

        class Summary {
            internal Summary(string ticker, string name, string assetClass, string assetSubClass,
                             double? weekly, double? fourWeeks, double? threeMonths, double? sixMonths,
                             double? oneYear, double? stdDev, double price, double? mav200) {
                Ticker = ticker;
                Name = name;
                AssetClass = assetClass;
                AssetSubClass = assetSubClass;
                // Abracadabra ...
                LRS = (fourWeeks + threeMonths + sixMonths + oneYear) / 4;
                Weekly = weekly;
                FourWeeks = fourWeeks;
                ThreeMonths = threeMonths;
                SixMonths = sixMonths;
                OneYear = oneYear;
                StdDev = stdDev;
                Mav200 = mav200;
                Price = price;
            }

            internal readonly string Ticker;
            internal readonly string Name;
            internal readonly string AssetClass;
            internal readonly string AssetSubClass;
            internal readonly double? LRS;
            internal readonly double? Weekly;
            internal readonly double? FourWeeks;
            internal readonly double? ThreeMonths;
            internal readonly double? SixMonths;
            internal readonly double? OneYear;
            internal readonly double? StdDev;
            internal readonly double? Mav200;
            internal double Price;

            internal static void Banner() {
                Console.Write("{0,-6}", "Ticker");
                Console.Write("{0,-50}", "Name");
                Console.Write("{0,-12}", "Asset Class");
                Console.Write("{0,4}", "RS");
                Console.Write("{0,4}", "1Wk");
                Console.Write("{0,4}", "4Wk");
                Console.Write("{0,4}", "3Ms");
                Console.Write("{0,4}", "6Ms");
                Console.Write("{0,4}", "1Yr");
                Console.Write("{0,6}", "Vol");
                Console.WriteLine("{0,2}", "Mv");
            }

            internal void Print() {
                Console.Write("{0,-6}", Ticker);
                Console.Write("{0,-50}", new String(Name.Take(48).ToArray()));
                Console.Write("{0,-12}", new String(AssetClass.Take(10).ToArray()));
                Console.Write("{0,4:N0}", LRS * 100);
                Console.Write("{0,4:N0}", Weekly * 100);
                Console.Write("{0,4:N0}", FourWeeks * 100);
                Console.Write("{0,4:N0}", ThreeMonths * 100);
                Console.Write("{0,4:N0}", SixMonths * 100);
                Console.Write("{0,4:N0}", OneYear * 100);
                Console.Write("{0,6:N0}", StdDev * 100);
                if (Price <= Mav200)
                    Console.WriteLine("{0,2}", "X");
                else
                    Console.WriteLine();
            }
        }

        class TimeSeries {
            internal readonly string Ticker;
            readonly DateTime _start;
            readonly Dictionary<DateTime, double> _adjDictionary;
            readonly string _name;
            readonly string _assetClass;
            readonly string _assetSubClass;

            internal TimeSeries(string ticker, string name, string assetClass, string assetSubClass,
                                IEnumerable<Event> events) {
                Ticker = ticker;
                _name = name;
                _assetClass = assetClass;
                _assetSubClass = assetSubClass;
                _start = events.Last().Date;
                _adjDictionary = events.ToDictionary(e => e.Date, e => e.Price);
            }

            bool GetPrice(DateTime when, out double price, out double shift) {
                // To nullify the effect of hours/min/sec/millisec being different from 0
                when = new DateTime(when.Year, when.Month, when.Day);
                var found = false;
                shift = 1;
                double aPrice = 0;
                while (when >= _start && !found) {
                    if (_adjDictionary.TryGetValue(when, out aPrice)) {
                        found = true;
                    }
                    when = when.AddDays(-1);
                    shift -= 1;
                }
                price = aPrice;
                return found;
            }

            double? GetReturn(DateTime start, DateTime end) {
                var startPrice = 0.0;
                var endPrice = 0.0;
                var shift = 0.0;
                var foundEnd = GetPrice(end, out endPrice, out shift);
                var foundStart = GetPrice(start.AddDays(shift), out startPrice, out shift);
                if (!foundStart || !foundEnd)
                    return null;
                else
                    return endPrice / startPrice - 1;
            }

            internal double? LastWeekReturn() {
                return GetReturn(DateTime.Now.AddDays(-7), DateTime.Now);
            }
            internal double? Last4WeeksReturn() {
                return GetReturn(DateTime.Now.AddDays(-28), DateTime.Now);
            }
            internal double? Last3MonthsReturn() {
                return GetReturn(DateTime.Now.AddMonths(-3), DateTime.Now);
            }
            internal double? Last6MonthsReturn() {
                return GetReturn(DateTime.Now.AddMonths(-6), DateTime.Now);
            }
            internal double? LastYearReturn() {
                return GetReturn(DateTime.Now.AddYears(-1), DateTime.Now);
            }

            internal double? StdDev() {
                var now = DateTime.Now;
                now = new DateTime(now.Year, now.Month, now.Day);
                var limit = now.AddYears(-3);
                var rets = new List<double>();
                while (now >= _start.AddDays(12) && now >= limit) {
                    var ret = GetReturn(now.AddDays(-7), now);
                    rets.Add(ret.Value);
                    now = now.AddDays(-7);
                }
                var mean = rets.Average();
                var variance = rets.Select(r => Math.Pow(r - mean, 2)).Sum();
                var weeklyStdDev = Math.Sqrt(variance / rets.Count);
                return weeklyStdDev * Math.Sqrt(40);
            }

            internal double? MAV200() {
                return _adjDictionary.ToList()
                                     .OrderByDescending(k => k.Key)
                                     .Take(200).Average(k => k.Value);
            }

            internal double TodayPrice() {
                var price = 0.0;
                var shift = 0.0;
                GetPrice(DateTime.Now, out price, out shift);
                return price;
            }

            internal Summary GetSummary() {
                return new Summary(Ticker, _name, _assetClass, _assetSubClass,
                                   LastWeekReturn(), Last4WeeksReturn(), Last3MonthsReturn(),
                                   Last6MonthsReturn(), LastYearReturn(), StdDev(),
                                   TodayPrice(), MAV200());
            }
        }

        class Program {
            static string CreateUrl(string ticker, DateTime start, DateTime end) {
                return @"http://ichart.finance.yahoo.com/table.csv?s=" + ticker
                    + "&a="+(start.Month - 1).ToString()+"&b="+start.Day.ToString()+"&c="+start.Year.ToString()
                    + "&d="+(end.Month - 1).ToString()+"&e="+end.Day.ToString()+"&f="+end.Year.ToString()
                    + "&g=d&ignore=.csv";
            }

            static void Main(string[] args) {
                // If you rise this above 5 you tend to get frequent connection closing on my machine
                // I'm not sure if it is msft network or yahoo web service
                ServicePointManager.DefaultConnectionLimit = 10;
                var tickers = File.ReadAllLines("ETFTest.csv")
                                  .Skip(1)
                                  .Select(l => l.Split(new[] { ',' }))
                                  .Where(v => v[2] != "Leveraged")
                                  .Select(values => Tuple.Create(values[0], values[1], values[2], values[3]))
                                  .ToArray();
                var len = tickers.Length;
                var start = DateTime.Now.AddYears(-2);
                var end = DateTime.Now;
                var cevent = new CountdownEvent(len);
                var summaries = new Summary[len];
                for(var i = 0; i < len; i++) {
                    var t = tickers[i];
                    var url = CreateUrl(t.Item1, start, end);
                    using (var webClient = new WebClient()) {
                        webClient.DownloadStringCompleted +=
                            new DownloadStringCompletedEventHandler(downloadStringCompleted);
                        webClient.DownloadStringAsync(new Uri(url), Tuple.Create(t, cevent, summaries, i));
                    }
                }
                cevent.Wait();
                Console.WriteLine("\n");
                var top15perc = summaries
                    .Where(s => s.LRS.HasValue)
                    .OrderByDescending(s => s.LRS)
                    .Take((int)(len * 0.15));
                var bottom15perc = summaries
                    .Where(s => s.LRS.HasValue)
                    .OrderBy(s => s.LRS)
                    .Take((int)(len * 0.15));
                Console.WriteLine();
                Summary.Banner();
                Console.WriteLine("TOP 15%");
                foreach(var s in top15perc)
                    s.Print();
                Console.WriteLine();
                Console.WriteLine("Bottom 15%");
                foreach (var s in bottom15perc)
                    s.Print();
            }

            static void downloadStringCompleted(object sender, DownloadStringCompletedEventArgs e) {
                var bigTuple = (Tuple<Tuple<string, string, string, string>, CountdownEvent, Summary[], int>)e.UserState;
                var tuple = bigTuple.Item1;
                var cevent = bigTuple.Item2;
                var summaries = bigTuple.Item3;
                var i = bigTuple.Item4;
                var ticker = tuple.Item1;
                var name = tuple.Item2;
                var asset = tuple.Item3;
                var subAsset = tuple.Item4;
                if (e.Error == null) {
                    var adjustedPrices = e.Result
                        .Split(new[] { '\n' })
                        .Skip(1)
                        .Select(l => l.Split(new[] { ',' }))
                        .Where(l => l.Length == 7)
                        .Select(v => new Event(DateTime.Parse(v[0]), Double.Parse(v[6])));
                    var timeSeries = new TimeSeries(ticker, name, asset, subAsset, adjustedPrices);
                    summaries[i] = timeSeries.GetSummary();
                    cevent.Signal();
                    Console.Write("{0} ", ticker);
                } else {
                    Console.WriteLine("[{0} ERROR] ", ticker);
                    summaries[i] = new Summary(ticker,name,"ERROR","ERROR",0,0,0,0,0,0,0,0);
                    cevent.Signal();
                }
            }
        }
    }

Java 8 (see the notes after the code):

    package etf.analyzer;

    import static java.lang.System.out;
    import static java.util.Comparator.comparing;
    import static java.util.stream.Collectors.*;

    import java.io.IOException;
    import java.nio.file.*;
    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.util.*;
    import java.util.Map.Entry;
    import java.util.concurrent.CountDownLatch;
    import java.util.stream.Stream;

    class Event {
        public Event(LocalDate date, double price) {
            this.date = date;
            this.price = price;
        }
        public LocalDate getDate() { return date; }
        public double getPrice() { return price; }
        private LocalDate date;
        private double price;
    }

    class Summary {
        public Summary(String ticker, String name, String assetClass, String assetSubClass,
                       OptionalDouble weekly, OptionalDouble fourWeeks, OptionalDouble threeMonths,
                       OptionalDouble sixMonths, OptionalDouble oneYear, OptionalDouble stdDev,
                       double price, OptionalDouble mav200) {
            this.ticker = ticker;
            this.name = name;
            this.assetClass = assetClass;
            // this.assetSubClass = assetSubClass;
            // Abracadabra ...
            this.lrs = fourWeeks.add(threeMonths).add(sixMonths).add(oneYear).divide(OptionalDouble.of(4.0d));
            this.weekly = weekly;
            this.fourWeeks = fourWeeks;
            this.threeMonths = threeMonths;
            this.sixMonths = sixMonths;
            this.oneYear = oneYear;
            this.stdDev = stdDev;
            this.mav200 = mav200;
            this.price = price;
        }

        private String ticker;
        private String name;
        private String assetClass;
        // private String assetSubClass;
        public OptionalDouble lrs;
        private OptionalDouble weekly;
        private OptionalDouble fourWeeks;
        private OptionalDouble threeMonths;
        private OptionalDouble sixMonths;
        private OptionalDouble oneYear;
        private OptionalDouble stdDev;
        private OptionalDouble mav200;
        private double price;

        static void banner() {
            out.printf("%-6s", "Ticker");
            out.printf("%-50s", "Name");
            out.printf("%-12s", "Asset Class");
            out.printf("%4s", "RS");
            out.printf("%4s", "1Wk");
            out.printf("%4s", "4Wk");
            out.printf("%4s", "3Ms");
            out.printf("%4s", "6Ms");
            out.printf("%4s", "1Yr");
            out.printf("%6s", "Vol");
            out.printf("%2s\n", "Mv");
        }

        void print() {
            out.printf("%-6s", ticker);
            out.printf("%-50s", name);
            out.printf("%-12s", assetClass);
            out.printf("%4.0f", lrs.orElse(0.0d) * 100);
            out.printf("%4.0f", weekly.orElse(0.0d) * 100);
            out.printf("%4.0f", fourWeeks.orElse(0.0d) * 100);
            out.printf("%4.0f", threeMonths.orElse(0.0d) * 100);
            out.printf("%4.0f", sixMonths.orElse(0.0d) * 100);
            out.printf("%4.0f", oneYear.orElse(0.0d) * 100);
            out.printf("%6.0f", stdDev.orElse(0.0d) * 100);
            if (price <= mav200.orElse(-Double.MAX_VALUE))
                out.printf("%2s\n", "X");
            else
                out.println();
        }
    }

    class TimeSeries {
        private String ticker;
        private LocalDate _start;
        private Map<LocalDate, Double> _adjDictionary;
        private String _name;
        private String _assetClass;
        private String _assetSubClass;

        public TimeSeries(String ticker, String name, String assetClass, String assetSubClass,
                          List<Event> events) {
            this.ticker = ticker;
            this._name = name;
            this._assetClass = assetClass;
            this._assetSubClass = assetSubClass;
            this._adjDictionary = events.stream().collect(toMap(Event::getDate, Event::getPrice));
            this._start = events.size() - 1 > 0
                ? events.get(events.size() - 1).getDate()
                : LocalDate.now().minusYears(99);
        }

        private static final class FindPriceAndShift {
            public FindPriceAndShift(boolean found, double aPrice, int shift) {
                this.found = found;
                this.price = aPrice;
                this.shift = shift;
            }
            private boolean found;
            private double price;
            private int shift;
        }

        private FindPriceAndShift getPrice(LocalDate when) {
            boolean found = false;
            int shift = 1;
            double aPrice = 0.0d;
            while ((when.equals(_start) || when.isAfter(_start)) && !found) {
                if (found = _adjDictionary.containsKey(when)) {
                    aPrice = _adjDictionary.get(when);
                }
                when = when.minusDays(1);
                shift -= 1;
            }
            return new FindPriceAndShift(found, aPrice, shift);
        }

        OptionalDouble getReturn(LocalDate start, LocalDate endDate) {
            FindPriceAndShift foundEnd = getPrice(endDate);
            FindPriceAndShift foundStart = getPrice(start.plusDays(foundEnd.shift));
            if (!foundStart.found || !foundEnd.found)
                return OptionalDouble.empty();
            else {
                return OptionalDouble.of(foundEnd.price / foundStart.price - 1.0d);
            }
        }

        private OptionalDouble lastWeekReturn() {
            return getReturn(LocalDate.now().minusDays(7), LocalDate.now());
        }
        private OptionalDouble last4WeeksReturn() {
            return getReturn(LocalDate.now().minusDays(28), LocalDate.now());
        }
        private OptionalDouble last3MonthsReturn() {
            return getReturn(LocalDate.now().minusMonths(3), LocalDate.now());
        }
        private OptionalDouble last6MonthsReturn() {
            return getReturn(LocalDate.now().minusMonths(6), LocalDate.now());
        }
        private OptionalDouble lastYearReturn() {
            return getReturn(LocalDate.now().minusYears(1), LocalDate.now());
        }

        private Double sum(Collection<Double> d) {
            return d.parallelStream().reduce(0d, Double::sum);
        }
        private Double avg(Collection<Double> d) {
            return sum(d) / d.size();
        }

        private OptionalDouble stdDev() {
            LocalDate now = LocalDate.now();
            LocalDate limit = now.minusYears(3);
            List<Double> rets = new ArrayList<>();
            while (now.compareTo(_start.plusDays(12)) >= 0 && now.compareTo(limit) >= 0) {
                OptionalDouble ret = getReturn(now.minusDays(7), now);
                rets.add(ret.orElse(0d));
                now = now.minusDays(7);
            }
            Double mean = avg(rets);
            Double variance = avg(rets.parallelStream().map(r -> Math.pow(r - mean, 2)).collect(toList()));
            Double weeklyStdDev = Math.sqrt(variance);
            return OptionalDouble.of(weeklyStdDev * Math.sqrt(40));
        }

        private OptionalDouble MAV200() {
            return OptionalDouble.of(
                _adjDictionary.entrySet().parallelStream()
                    .sorted(comparing((Entry<LocalDate,Double> p) -> p.getKey()).reversed())
                    .limit(200).mapToDouble(e -> e.getValue()).average().orElse(0d)
            );
        }

        private double todayPrice() {
            return getPrice(LocalDate.now()).price;
        }

        public Summary getSummary() {
            return new Summary(ticker, _name, _assetClass, _assetSubClass,
                               lastWeekReturn(), last4WeeksReturn(), last3MonthsReturn(),
                               last6MonthsReturn(), lastYearReturn(), stdDev(),
                               todayPrice(), MAV200());
        }
    }

    public class Program {
        static String createUrl(String ticker, LocalDate start, LocalDate end) {
            return "http://ichart.finance.yahoo.com/table.csv?s=" + ticker
                + "&a=" + (start.getMonthValue() - 1) + "&b=" + start.getDayOfMonth() + "&c=" + start.getYear()
                + "&d=" + (end.getMonthValue() - 1) + "&e=" + end.getDayOfMonth() + "&f=" + end.getYear()
                + "&g=d&ignore=.csv";
        }

        public static void main(String[] args) throws IOException, InterruptedException {
            List<String[]> tickers = Files.lines(FileSystems.getDefault().getPath("ETFs.csv"))
                .skip(1)
                .parallel()
                .map(line -> line.split(",", 4))
                .filter(v -> !v[2].equals("Leveraged"))
                .collect(toList());
            int len = tickers.size();
            LocalDate start = LocalDate.now().minusYears(2);
            LocalDate end = LocalDate.now();
            CountDownLatch cevent = new CountDownLatch(len);
            Summary[] summaries = new Summary[len];
            try (WebClient webClient = new WebClient()) {
                for (int i = 0; i < len; i++) {
                    String[] t = tickers.get(i);
                    final int index = i;
                    webClient.downloadStringAsync(createUrl(t[0], start, end), result -> {
                        summaries[index] = downloadStringCompleted(t[0], t[1], t[2], t[3], result);
                        cevent.countDown();
                    });
                }
                cevent.await();
            }
            Stream<Summary> top15perc = Arrays.stream(summaries)
                .filter(s -> s.lrs.isPresent())
                .sorted(comparing((Summary p) -> p.lrs.get()).reversed())
                .limit((int)(len * 0.15));
            Stream<Summary> bottom15perc = Arrays.stream(summaries)
                .filter(s -> s.lrs.isPresent())
                .sorted(comparing((Summary p) -> p.lrs.get()))
                .limit((int)(len * 0.15));
            System.out.println();
            Summary.banner();
            System.out.println("TOP 15%");
            top15perc.forEach( s -> s.print());
            System.out.println();
            Summary.banner();
            System.out.println("BOTTOM 15%");
            bottom15perc.forEach( s -> s.print());
        }

        public static Summary downloadStringCompleted(String ticker, String name, String asset,
                                                      String subAsset, DownloadStringAsyncCompletedArgs e ) {
            Summary summary;
            if (e.getError() == null) {
                List<Event> adjustedPrices = Arrays.stream(e.getResult().split("\n"))
                    .skip(1)
                    .parallel()
                    .map(line -> line.split(",", 7))
                    .filter(l -> l.length == 7)
                    .map(v -> new Event(LocalDate.parse(v[0], DateTimeFormatter.ISO_LOCAL_DATE), Double.valueOf(v[6])))
                    .collect(toList());
                TimeSeries timeSeries = new TimeSeries(ticker, name, asset, subAsset, adjustedPrices);
                summary = timeSeries.getSummary();
            } else {
                System.err.printf("[%s ERROR]", ticker);
                final OptionalDouble zero = OptionalDouble.of(0d);
                summary = new Summary(ticker, name, "ERROR", "ERROR", zero, zero, zero, zero, zero, zero, 0d, zero);
            }
            return summary;
        }
    }

Some observations:
- The code depends on Yahoo for historical stock prices, and sometimes Yahoo is not available. If that happens, wait five minutes and run the program again.
- The Java code is much, much faster than the C# code, even though fetching the historical stock prices from Yahoo should be the limiting factor for both. I don't think the C# code should be slower than the Java code, but it is, and I'm not sure why. I'm pretty sure the poor C# performance has to do with the .NET WebClient configuration, but I might be wrong.
- In Java 8, just to show how easy it is, I've used parallelStream() and .parallel() in a couple of places, but these calls can be removed without changing the results. I can see no noticeable difference in performance with or without them on an 8-core machine. As I said above, I believe the limiting factor is fetching the historical stock prices from Yahoo. There is not much number crunching to do, and I suspect the time it takes pales into insignificance next to the internet fetch time, so doing the calculations in parallel just isn't worth it here. But it's good to see how easy it is to parallelize work if you want to. Being able to simply say Collection.parallelStream() and Stream.parallel() is incredible if you find a sensible use case for it.
- The Java 8 code is a little longer than the C# code. In Java 7, I'm guessing the code would be at least twice as long and very, very ugly if written in a similar style. The Java 8 code is not as concise as C# or Scala, but at least it's in the same ballpark. Partly this is due to Java POJO boilerplate (e.g. the FindPriceAndShift class and the Event class getters), but that is no big deal (IMO). The Java code is also more verbose because types must be declared, unlike in C#, where you can use "var" instead of a type declaration and the compiler usually infers what you mean.
- Tuples. C# has tuples and Scala has tuples, but apparently their use is the spawn of Satan, and civilization will collapse if they are used in Java, even to hold temporary results when parsing comma-separated values into another class. (Oracle will be removing HashMap from Java 9, apparently for similar reasons ;)) In order not to be arrested by the Java thought police, I did not succumb: the C# code uses tuples, but I've managed to avoid them.
- Output parameters. My 2009 Scala translation returned a tuple instead of using the C# output parameters (which I personally found confusing in the C# algorithm). In the Java 8 version I used a POJO, FindPriceAndShift, rather than sell my soul to the wicked tuple monster.
- The C# code uses the "double?" type, which is a double that can have an empty value. It means you can write LRS = (fourWeeks + threeMonths + sixMonths + oneYear) / 4, and any of fourWeeks, threeMonths, sixMonths, and oneYear can be empty without causing a null pointer exception: the result is simply empty as well. Java 8 does ship with OptionalDouble but, strangely, you can't say a.add(b).add(c).divide(d). So I wrote my own OptionalDouble class that does allow this, so you can say lrs = fourWeeks.add(threeMonths).add(sixMonths).add(oneYear).divide(OptionalDouble.of(4.0d)). If you look at the code you can see it's almost trivially simple. That expression is not very pretty compared to the C# or Scala equivalent, but a lot of Java people are used to this kind of method chaining with BigDecimal, and with OptionalDouble it is now null/empty-value safe. (The same thing can obviously be done to create an OptionalBigDecimal class, and this OptionalDouble stuff could easily have been done in Java 7 too.)
- Java does not have a C#-style WebClient, so I have taken the open source Jetty HTTP client and put a simple wrapper around it to make it look like the C# WebClient. See GitHub for the WebClient class.
- Java lives on open source. If the C# code is slow because the .NET WebClient is doing something stupid, it's hard to find out, as it's closed source. If Jetty's Java HTTP client is broken, you can debug the source or switch to Apache's HTTP client: the best open source libraries emerge through natural selection. [Update: reaction from Reddit (I love reddit!): "Sorry, that is pure bullshit. It is perfectly feasible to debug .Net Framework source code: http://msdn.microsoft.com/en-us/library/cc667410.aspx And no, it doesn't have a bug. They've been working on that for generations, and Microsoft puts serious money and has serious people working on stuff, as opposed to a bunch of unknown random hippie weed smokers financed by random coin slot donations. and even if java was faster it doesn't change the fact that it is a useless dinosaur which gets improvements 10 years after the rest of the mainstream languages. All that crappy bloated unmaintainable event-based async code can be converted to a beautiful sequence of async / await in C# 5.0, whereas you will probably not see anything like that in java in the next 20 years due to it's complete lack of evolution and retardedness."]
- There is some surprising functionality missing from the Stream and Collections APIs. There is no zip, takeWhile, or dropWhile for sequential streams. I'm guessing Java 9, Guava, and others will fill this gap pretty fast.
- When I showed the code below to an experienced colleague who has only used Java <= 6, he said "that looks like C++ to me: that's completely unmaintainable". Sigh.

    Stream<Summary> top15perc = Arrays.stream(summaries)
        .filter(s -> s.lrs.isPresent())
        .sorted(comparing((Summary p) -> p.lrs.get()).reversed())
        .limit((int)(len * 0.15));
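The chainable, empty-safe double described in the observations above might look roughly like the sketch below. It is not the class from the repository: the name `ChainableOptionalDouble` is made up here to avoid clashing with `java.util.OptionalDouble`, and the rule that any empty operand makes the result empty is an assumption chosen to mirror how C# arithmetic on `double?` behaves.

```java
// Hypothetical sketch of an empty-safe double with chainable arithmetic.
// The real class in the GitHub project may be shaped differently.
final class ChainableOptionalDouble {
    private static final ChainableOptionalDouble EMPTY = new ChainableOptionalDouble(false, 0d);

    private final boolean present;
    private final double value;

    private ChainableOptionalDouble(boolean present, double value) {
        this.present = present;
        this.value = value;
    }

    static ChainableOptionalDouble of(double value) {
        return new ChainableOptionalDouble(true, value);
    }

    static ChainableOptionalDouble empty() {
        return EMPTY;
    }

    boolean isPresent() {
        return present;
    }

    double orElse(double other) {
        return present ? value : other;
    }

    // Assumption: the result is empty if either operand is empty.
    ChainableOptionalDouble add(ChainableOptionalDouble other) {
        return present && other.present ? of(value + other.value) : EMPTY;
    }

    // Same empty-propagation rule as add().
    ChainableOptionalDouble divide(ChainableOptionalDouble other) {
        return present && other.present ? of(value / other.value) : EMPTY;
    }
}

class ChainableOptionalDoubleDemo {
    public static void main(String[] args) {
        ChainableOptionalDouble four = ChainableOptionalDouble.of(4.0d);

        // All operands present: (1 + 2 + 3 + 6) / 4
        ChainableOptionalDouble average = ChainableOptionalDouble.of(1.0d)
                .add(ChainableOptionalDouble.of(2.0d))
                .add(ChainableOptionalDouble.of(3.0d))
                .add(ChainableOptionalDouble.of(6.0d))
                .divide(four);
        System.out.println(average.orElse(0d)); // prints 3.0

        // One empty operand makes the whole chain empty, with no exception thrown.
        ChainableOptionalDouble missing = ChainableOptionalDouble.of(1.0d)
                .add(ChainableOptionalDouble.empty())
                .divide(four);
        System.out.println(missing.isPresent()); // prints false
    }
}
```

Keeping the class immutable means an instance can be shared freely, which is also how BigDecimal method chaining behaves.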
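As for the missing takeWhile, one way to fill the gap by hand on a sequential stream is to wrap the source's Spliterator. This is only a sketch: `StreamExtras` and its `takeWhile` helper are names invented for this example, and the approach is one possibility built on the standard `Spliterators.AbstractSpliterator` and `StreamSupport` classes, not something taken from the project.

```java
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.Consumer;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Hypothetical helper class; not part of the JDK or of the project.
final class StreamExtras {
    private StreamExtras() {
    }

    // Returns a sequential stream of the leading elements that match the
    // predicate, stopping at (and dropping) the first one that does not.
    static <T> Stream<T> takeWhile(Stream<T> source, Predicate<? super T> predicate) {
        Spliterator<T> input = source.spliterator();
        Spliterator<T> output = new Spliterators.AbstractSpliterator<T>(
                input.estimateSize(), input.characteristics() & Spliterator.ORDERED) {
            private boolean done = false;

            @Override
            public boolean tryAdvance(Consumer<? super T> action) {
                if (done) {
                    return false;
                }
                boolean advanced = input.tryAdvance(item -> {
                    if (predicate.test(item)) {
                        action.accept(item);
                    } else {
                        done = true; // first failing element ends the stream
                    }
                });
                return advanced && !done;
            }
        };
        // Closing the returned stream also closes the source stream.
        return StreamSupport.stream(output, false).onClose(source::close);
    }
}

class StreamExtrasDemo {
    public static void main(String[] args) {
        // Works on an infinite source because evaluation stops at the first failure.
        Stream<Integer> naturals = Stream.iterate(1, x -> x + 1);
        List<Integer> small = StreamExtras.takeWhile(naturals, x -> x < 4)
                .collect(Collectors.toList());
        System.out.println(small); // prints [1, 2, 3]
    }
}
```

A dropWhile helper could be written the same way by skipping elements until the predicate first fails and passing everything through after that.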