Skip to content

Conversation

@zhujun98
Copy link
Contributor

Checklist

  • The title and commit message(s) are descriptive.
  • Small commits made to fix your PR have been squashed to avoid history pollution.
  • Tests have been added for new features or bug fixes.
  • API of new functions and classes are documented.

Description

@JohanMabille
Copy link
Member

I think the failures are due to missing overloads of isnan in xsimd. These functions were not called before your optimization.

I have on going work on xsimd to push these functions, I'll try to push them quickly so that we can merge this PR.

@zhujun98
Copy link
Contributor Author

I think the failures are due to missing overloads of isnan in xsimd. These functions were not called before your optimization.

I have on going work on xsimd to push these functions, I'll try to push them quickly so that we can merge this PR.

Thanks a lot @JohanMabille ! Actually, the unittest without SIMD failed too. I will investigate it in the meanwhile.

@zhujun98 zhujun98 force-pushed the improve_performance_of_nanvar_and_nanstd branch from bc16cdd to 622647d Compare April 7, 2021 21:01
@zhujun98 zhujun98 force-pushed the improve_performance_of_nanvar_and_nanstd branch from 622647d to d01be62 Compare May 31, 2021 09:51
return nanmean<result_type>(square(cast<result_type>(sc) - std::move(mrv)), std::forward<X>(axes), es);
// note: otherwise the result is wrong with 'immediate' evaluation strategy
auto sc_shifted = eval(cast<result_type>(sc) - std::move(mrv));
return nanmean<result_type>(square(sc_shifted), std::forward<X>(axes), es);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am completely lost and I summarize my findings here:

Test code:

xt::xarray<double> aN = {{ nanv, nanv, 123, 3 }, { 1, 2, nanv, 3 }, { 1, 1, nanv, 3 }}; xt::xarray<double> ret2 = xt::nanvar(aN, {1}); std::cout << "Lazy: " << ret2 << std::endl; xt::xarray<double> ret = xt::nanvar(aN, {1}, xt::evaluation_strategy::immediate); std::cout << "Immediate: " << ret << std::endl;
  • First case:
auto sc_shifted = eval(cast<result_type>(sc) - std::move(mrv)); return nanmean<result_type>(square(sc_shifted), std::forward<X>(axes), es);

Result with gcc8:

Lazy: { 2400. , 0.666667, 0.888889} Immediate: { 3600. , 0.666667, 0.888889} 

Result with Clang7: segfault

je 0x4198a0 ; <xt::xreducer_stepper<xt::xreducer_functors<xt::detail::nan_plus, xt::const_value<double>, xt::detail::nan_plus>, xt::xshared_expression<xt::xfunction<xt::detail::lambda_adapt<xt::square_fct>, xt::xarray_container<xt::uvector<double, std::allocator<double> >, (xt::layout_type)1, xt::svector<unsigned long, 4ul, std::allocator<unsigned long>, true>, xt::xtensor_expression_tag> const&> >, std::array<unsigned long, 1ul>, xt::reducer_options<double, std::tuple<xt::evaluation_strategy::lazy_type> > >::aggregate_impl(unsigned long, std::integral_constant<bool, false>) const+784>	shl $0x3,%rcx	neg %rax	mov %rdi,%rsi	nopw 0x0(%rax,%rax,1)	vmovsd (%rsi),%xmm0 
  • Second case:
return nanmean<result_type>(square(cast<result_type>(sc) - std::move(mrv)), std::forward<X>(axes), es);

Result with gcc8:

Lazy: { 3600. , 0.666667, 0.888889} Immediate: { 7365.388889, 4.666667, 12. } 

Result with clang7:

Lazy: { 3600. , 0.666667, 0.888889} Immediate: { 7365.388889, 12. , 363. } 

@JohanMabille @tdegeus Can you give some hint? Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants