Professional Documents
Culture Documents
On behalf of the JuliaCon organising committee, I want to announce that JuliaCon 2021 will be held online. While the end might be in sight for COVID-19 with
vaccine rollouts in some parts of the world, we are not confident that travel will be back to normal globally by next summer. Hence, we think it makes sense to
apply our learnings from this year towards another online conference.
The main conference will take place on Wednesday-Friday, July 28-30. Workshops will be held in the days and weeks prior, leading up to the main event. Like this
year, registration will be required, but tickets will be free.
The CFP will be released in mid January, with proposals due by mid March. We hope to continue to receive an impressive array of talk proposals, from beginners
and experienced Julia programmers alike.
I’m new to Julia, but already impressed by it’s syntax and by statsplots. However, some functionality is 1 / 12
still missing that I miss from Python/R. Nov 2020
Specifically for this post, I would like to color-code the output of clustering.hclust, as is seen in python/R,
so that it looks more like this:
Jan 15
Here is a minimally working example to get a distance matrix and dendrogram with group behavior.
using Clustering
using StatsPlots
using Distances
Considering I’m just starting to use julia, is this problem worth solving in pure julia? Is there a better way
to approach this? e.g., use a different plotting package/backend, or import python/R functions and not
waste time?
Now, here is how far I’ve gotten, and the direction of “pure julia” to get this to work. However, accessing
the line elements seems really unintuitive and was very difficult to even find.
function dendrocolor(dg)
for i=1:10000
try
dg[1][i].plotattributes[:linecolor] = RGBA{Float64}(0.8888735002725198,0.43564919034818994,0.2781229361419438,1.0)
catch e
break
end
end
end
This will change the entire dendrogram orange. I’m thinking I could
Use this function and call each line separately in the dendrogram
leverage hcl1 structure which contains merges and height
use the clustering.cutree function to accomplish this
However, I don’t know where to start parsing all that together, and it seems like a lot of work. Is there a
better way?
You want some nodes/lines to have different colors, and the user to specify those colors by themselves.
You can do that by passing a group argument to plot when you plot a Hclust object. Try passing a
random group vector of e.g. 4 different integers to plot, eg dg = plot(hcl1, group =
my_random_vector) and see what happens. The right length I guess should be the same as the
number of nodes in the dendrogram. Then the task reduces to finding the right way/order of specifying
the group vector.
ok, so if I use the cutree function I can get the cluster integers as a vector that matches the number of
branches (1000), and that’s what I would expect should go in the group argument.
This error is returned though, so I guess I don’t understand what this argument is looking for.
ERROR: BoundsError: attempt to access 4-element Array{Float64,1} at index [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 192, 193, 194, 195,
Stacktrace:
[1] throw_boundserror(::Array{Float64,1}, ::Tuple{Array{Int64,1}}) at .\abstractarray.jl:541
[2] checkbounds at .\abstractarray.jl:506 [inlined]
[3] _getindex at .\multidimensional.jl:742 [inlined]
[4] getindex at .\abstractarray.jl:1060 [inlined]
[5] filter_data(::Array{Float64,1}, ::Array{Int64,1}) at C:\Users\Clay\.julia\packages\RecipesPipeline\uPBKQ\src\group.jl:46
[6] filter_data!(::Dict{Symbol,Any}, ::Array{Int64,1}) at C:\Users\Clay\.julia\packages\RecipesPipeline\uPBKQ\src\group.jl:51
[7] _filter_input_data!(::Dict{Symbol,Any}) at C:\Users\Clay\.julia\packages\RecipesPipeline\uPBKQ\src\group.jl:58
[8] _finish_userrecipe!(::Plots.Plot{Plots.GRBackend}, ::Array{Dict{Symbol,Any},1}, ::RecipeData) at C:\Users\Clay\.julia\package
[9] _process_userrecipes!(::Any, ::Any, ::Any) at C:\Users\Clay\.julia\packages\RecipesPipeline\uPBKQ\src\user_recipe.jl:34
[10] recipe_pipeline!(::Any, ::Any, ::Any) at C:\Users\Clay\.julia\packages\RecipesPipeline\uPBKQ\src\RecipesPipeline.jl:70
[11] _plot!(::Plots.Plot, ::Any, ::Any) at C:\Users\Clay\.julia\packages\Plots\vsE7b\src\plot.jl:172
[12] #plot#129 at C:\Users\Clay\.julia\packages\Plots\vsE7b\src\plot.jl:58 [inlined]
[13] top-level scope at none:1
I also made a Array{Array{Int64,1},1} of arbitrary indices in 9 groups that also didn’t work.
Ah OK, so something in the implementation of the recipe is thwarting the group argument -
@kevbonham I think you wrote that recipe - can you easily see how it could be modified to handle
group inside?
14 DAYS LATER
Oof - sorry I didn’t see this until just now . Somehow the notification got buried.
It’s been so long I don’t remember the implementation exactly, but I will try taking a look this week. I
think it would be awesome to have this functionality, but I don’t think it will be trivial. @alailink would
you please open an issue on Stats Plots and tag me (I’m @kescobo on GitHub)? I’m less likely to lose
track of it there. Sorry again for taking so long to get back to you
thanks for taking an interest in this problem! I opened an issue. Let me know if I can help you develop it
at all! I’m new to Julia but I’ve programmed in matlab/python/R.
Also, as a quick side note, graphing in Julia feels so slow. I’m using Atom+Juno+Gre. A large heatmap
takes ages to load on my fast computer, and “quick” graphs take seconds where matlab/python would
be instant. Any advice?
@kevbonham I really hope it shouldn’t be too involved, I guess the way grouping is implemented for
groupedboxplot could be used essentially?
@alailink that’s really strange, most plots should take around 100 ms to produce - which is not
instantaneous but should be fast enough. Could be a Juno thing? Atom Juno isn’t really developed
anymore
2 Replies
mkborregaard:
most plots should take around 100 ms to produce - which is not instantaneous but should be fast
enough.
Unless we’re talking about the first plot of course. It’s better than it was, but still slow for someone
coming from matlab/python/R
mkborregaard:
I really hope it shouldn’t be too involved, I guess the way grouping is implemented for
groupedboxplot could be used essentially?
I don’t think so - the plot object is sort of a custom construct. Will discuss more in the issue when I’ve
had a chance to look.
1 MONTH LATER
monty Jan 13
alailink Jan 14
Unfortunately this issue was surprisingly complicated. It has been moved to a longer-term project as a
github PR here: WIP group colors for hclust plot by alailink · Pull Request #411 · JuliaPlots/StatsPlots.jl
· GitHub 4
and statsmodels issue here: Dendrogram, add colors to groups · Issue #408 · JuliaPlots/StatsPlots.jl ·
GitHub 4
kevbonham Jan 15
Yeah, it’s a bit of a beast. Definitely will be worth having though. Don’t feel like you need to wait for
something that’s working - commit and push away - all of that history will be squashed before it’s
merged into the main branch. Assuming you’ve done anything - also totally understandable if you
haven’t .
Reply
Suggested Topics
Want to read more? Browse other topics in Visualization or view latest topics.